INTRODUCTION

The early molecular events that lead to sporadic prostate cancer progression are largely unknown. Human DNA
polymerases beta, eta, and kappa are distributive, error-prone enzymes that function in a relatively accurate
manner when replicating damaged DNA. We set out to: (1) identify common somatic variants in the human pol
beta, pol eta and pol kappa genes in men suffering from prostate cancer; (2) measure the frequency and
distribution by tumor stage, tumor grade and patient age of each of the variants identified in (1) in human
prostate cancer tissues; and (3) determine the effect of the somatic variants identified in (1) on polymerase
function.

BODY 

In Task 1, we set out to identify common somatic variants in the human pol beta, pol eta and pol kappa genes in
men suffering from prostate cancer. We thus sequenced the coding sequence of the pol β, η and κ genes for
somatic mutations in 40 prostate cancer tissues, to test our hypothesis that these genes are commonly mutated
in prostate cancer tissue. Briefly, we PCR amplified all exons using DNA isolated from
microdissected prostate cancer tissue and then sequenced the PCR products bi-directionally with BigDye
chemistry on an ABI sequencer (detailed in Makridakis et al, 2009; see Appendix). So far, we have completed the
pol beta gene, finished all but one exon of the pol kappa gene, and analyzed 3 exons of the pol eta gene.
We identified many somatic mutations in these samples, most of them missense (Table 1, and Makridakis et
al, 2009).

Although  the  above  strategy  is  important  for  the  discovery  of  biomarkers  of  prostate  tumor  progression,  it 
would  also  be  highly  beneficial  to  discover  common  biomarkers  of  pre-invasive  prostate  cancer,  or  even 
precancerous  lesions,  for  maximum  reduction  of  mortality  from  this  disease.  Thus  we  set  out  to  genotype  40 
prostatic  intraepithelial  neoplasia  (PIN;  a  precursor  to  prostate  cancer)  tissues  for  the  presence  of  the  pol  beta 
missense variants we identified (Table 1). PIN and adjacent normal prostate tissues were isolated by
laser-capture microdissection from prostate tissue slides donated by Dr. Zongbing You of Tulane University. Genomic
DNA was extracted from the PIN tissues with the Picopure® DNA Extraction Kit (Molecular Devices; Sunnyvale,
CA), and PCR amplified using the same primers that identified the POLB mutations shown in Table 1.
However, all PCR amplifications we tested failed, while our positive controls worked (data not shown). We also
failed to detect DNA in our PIN tissues using Picogreen (Promega Co; Madison, WI). These data suggest that
we did not recover enough DNA from our PIN tissues for adequate PCR amplification. This is not surprising,
since PIN tissue usually consists of a single layer of epithelial cells lining the prostate gland.

Given  the  high  number  of  somatic  mutations  identified  (including  many  missense  mutations;  Table  1),  we 
decided  to  determine  the  effect  of  these  mutations  on  biochemical  activity  (Task  3)  prior  to  establishing  their 
frequency  in  sporadic  prostate  tumors  (Task  2)  (genotyping  a  high  number  of  mutations  makes  more  sense 
when  one  knows  that  a  significant  proportion  of  these  mutations  are  not  mere  passengers  of  cancer  evolution). 

We thus set out to biochemically characterize each missense polymerase beta variant identified in Task 1 (Table
1). We initially focused on missense mutations because they have a higher chance of causing functional effects.
Briefly, we reconstructed all missense pol β variations in an appropriate expression vector and then measured
the effect of each mutation on enzyme activity, protein expression and fidelity of DNA synthesis in vitro, after
purification of the respective overexpressed enzymes (detailed in An et al, 2011; see Appendix).

Gene (accession #)   Patient  DNA change     Type of change/location  Mutant peak (%)  Predicted/known effect

POLB (AF491812.1)    1        g.23912G>A     AG-splice junction       59%              Deletion of amino acids 184-185
                     2        g.25254G>A     E216K                    52%
                     4        g.960C>T       5'-UTR                   52%
                              g.18145T>C     N128N                    54%
                     5        g.32481C>T     P261L                    100%             Altered fidelity
                              g.32573A>G     T292A                    100%             Altered fidelity
                              g.32592T>C     I298T                    100%
                              g.32439G>A*    Intron 12                79%
                     8        g.11628T>C     Intron 3                 27%
                     10       g.25314A>T     M236L                    38%              Altered fidelity
                     11       g.15180G>A     E123K                    100%
                     12       g.1444G>C      K27N                     60%
                     13       g.25236C>T     L210L                    42%
                              g.31911C>G**   P242R                    40%              Altered fidelity
                     15       g.921C>T       Promoter                 100%
                     20       g.25302G>A     E232K                    100%
                     23       g.12622A>G     AG-splice junction       25%              Deletion of amino acids 88-90
                     24       g.11630T>C     Intron 3                 30%
                     29       g.32521G>T     G274G                    35%
                     30       g.32467T>C     Intron 12                51%
                              g.32439G>A*    Intron 12                100%

POLH (AY388614.1)    1        g.28891G>A     G259R                    23%
                     6        g.29021A>T     Intron 7                 46%
                     20       g.28901G>C     G263A                    60%              XPV
                              g.28967C>T     S284F                    66%

POLK (AY273797.1)    1        c.560T>C       F155S                    45%
                     4        c.507T>C       S137S                    35%
                              c.557G>A       G154E                    100%
                     6        c.1420C>T      L442F                    30%
                     10       c.1353G>A      E419E                    100%
                     12       g.67088C>A     T205K                    28%
                     21       g.66992A>G     Intron 5                 68%              Changes Lariat-A
                     27       g.67088C>T     T205I                    30%
                     29       c.1380G>A      A428A                    100%
                     5        c.1385A>G      E430G                    100%
                     31       c.2416G>A      G774S                    60%
                              c.2441G>A      C782Y                    50%
                     33       c.2432G>A      C779Y                    43%
                              c.2567G>A      S824N                    45%
                              c.2694T>G      D866E                    23%
                     34       c.425C>T       A110V                    20%
                              c.469G>A       D125N                    19%
                              c.482C>T       A129V                    22%
                              c.2186C>T      S697L                    50%
                     35       c.433G>A       A113T                    20%
                              c.1437G>A      Q447Q                    100%
                              c.1441G>A      E449K                    100%


* Creates a new branch (Lariat-A) site
** Also in constitutional DNA (rs3136797)

Table 1. Somatic mutations detected in polymerase genes in prostate cancer patients.

Notes: Mutations not previously reported are shown in bold. % Mutant indicates the sequencing peak height that corresponds to the
mutant (average of forward and reverse sequence). For both mutations that change the invariant splice junction (AG), an in-frame AG
exists shortly downstream. Utilization of the alternative AG is predicted to result in the deletions shown in the last column. XPV
denotes a pol η residue mutated in an XPV patient. For further details, see text.




[Figure 1: bar chart; x-axis categories: WT, K27N, E123K, E232K, P242R, E216K, M236L, Triple]

Fig. 1. Influence of POLB variants on catalytic efficiency for
dCTP insertion. WT and mutant variants were assayed on a gapped
oligo substrate with a templating dG, and the catalytic efficiency
(kcat/Km,dNTP) was determined by dividing the kcat and Km values
obtained by Excel analysis of the inverse plots. These data represent
the mean of at least three independent determinations.


In  vitro  biochemical  (polymerase  beta)  assays 
for  both  wild  type  (WT)  and  mutant  variants 
were  performed  on  a  single-gapped 
oligonucleotide  substrate  (the  natural 
polymerase  beta  substrate)  with  a  templating 
deoxy-G.  The  experiments  were  performed 
at  steady-state  conditions  for  the  enzyme,  and 
the  catalytic  efficiency  of  each  variant  was 
determined,  as  a  measure  of  enzyme  activity. 
The  data  comparing  WT  to  mutant 
efficiencies  are  shown  in  Figure  1. 
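
The inverse-plot analysis mentioned in the Figure 1 legend can be sketched as follows. This is a minimal illustration, not the authors' Excel workflow: it assumes a standard double-reciprocal (Lineweaver-Burk) fit, and the function names, substrate concentrations, rates and enzyme concentration are hypothetical values chosen only to show the arithmetic.

```python
def inverse_plot_fit(substrate_conc, rates):
    """Fit a least-squares line to (1/[S], 1/v) and return (Vmax, Km).

    On a double-reciprocal plot, 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax,
    so the intercept gives 1/Vmax and the slope gives Km/Vmax.
    """
    xs = [1.0 / s for s in substrate_conc]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


def catalytic_efficiency(vmax, km, enzyme_conc):
    """Return kcat/Km, with kcat = Vmax / [enzyme]."""
    kcat = vmax / enzyme_conc
    return kcat / km


# Hypothetical, noise-free Michaelis-Menten data (not from the report).
TRUE_VMAX = 2.0
TRUE_KM = 5.0
concs = [1.0, 2.0, 5.0, 10.0, 20.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in concs]

vmax, km = inverse_plot_fit(concs, rates)
efficiency = catalytic_efficiency(vmax, km, enzyme_conc=0.5)
```

With real data the reciprocal transform weights low-concentration points heavily, so a direct nonlinear fit is often preferred; the sketch follows the inverse-plot approach only because that is what the legend describes.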

The data presented in Figure 1 indicate that
two of the pol β variants significantly reduce
catalytic efficiency (K27N and the Triple mutant:
P261L/T292A/I298T). The remaining variants retain wild-type activity.

Some mutations may leave the enzyme active, and yet still affect mutagenicity by altering the fidelity of DNA synthesis. We
thus proceeded to characterize the effect of all POLB variations on DNA synthesis fidelity using the same
single-gapped oligonucleotide substrate with a templating dG (detailed in An et al, 2011; see Appendix).
Fidelity measurements were performed with each of the four dNTPs. The data, shown in Figure 2, indicate that
the majority of the POLB somatic mutations affect the fidelity of DNA synthesis. Thus, two of the pol β
variants assayed reduce catalytic efficiency (Fig. 1), while the remaining five missense mutations alter the
fidelity of DNA synthesis (Fig. 2).
Fig. 2. Relative fidelity of variant enzymes. WT and
mutant variants were assayed on a single-nucleotide gapped
DNA substrate with a templating dG.
Fidelity = [(kcat/Km,dNTP)correct + (kcat/Km,dNTP)incorrect] / (kcat/Km,dNTP)incorrect.
These data represent the mean of at least three independent
determinations.
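
As a small worked illustration of the fidelity measure used for Figure 2, the sketch below assumes the usual steady-state definition, in which fidelity is the sum of the catalytic efficiencies for correct and incorrect dNTP insertion divided by the efficiency for incorrect insertion. Expressing a variant's fidelity relative to wild type is also an assumption about how "relative fidelity" was computed, and all numbers are hypothetical.

```python
def fidelity(eff_correct, eff_incorrect):
    """Fidelity from catalytic efficiencies (kcat/Km) for the correct
    and an incorrect dNTP: (correct + incorrect) / incorrect."""
    return (eff_correct + eff_incorrect) / eff_incorrect


def relative_fidelity(variant_fidelity, wild_type_fidelity):
    """Variant fidelity expressed as a fraction of wild-type fidelity
    (assumed normalization; values below 1 mean a less accurate enzyme)."""
    return variant_fidelity / wild_type_fidelity


# Hypothetical efficiencies in arbitrary but consistent units.
wt_fidelity = fidelity(eff_correct=1.0, eff_incorrect=1.0e-4)
variant_fidelity = fidelity(eff_correct=1.0, eff_incorrect=4.0e-4)
ratio = relative_fidelity(variant_fidelity, wt_fidelity)
```

Here the hypothetical variant misinserts four times more efficiently than wild type, so its fidelity falls to roughly a quarter of the wild-type value even though insertion of the correct nucleotide is unchanged.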


However, fidelity depends on sequence context, and we have thus far analyzed the mutants in a single
sequence context (An et al, 2011; see Appendix). Therefore, in order to better predict the mutagenic
effect of these somatic POLB mutations, we plan to utilize the Forward Mutation Assay (Bebenek
et al, 2003). This assay relies on scoring errors produced by a DNA polymerase while filling a
407-nucleotide gap in the lacZα gene, and thus can identify all types of mutations (deletions,
insertions, substitutions, etc.) in different sequence contexts. Accurate DNA synthesis produces
M13mp2 DNA that yields dark blue phage plaques upon introduction into an Escherichia coli
α-complementation strain and plating on indicator plates. Errors are scored as light blue or colorless
mutant plaques. These plaques are restreaked to verify that they contain mutations. DNA from
independent mutant clones is then isolated and sequenced to define the lacZ mutation (Bebenek et al, 2003).


Unlike the remaining POLB variants analyzed, the K27N missense mutation lies in the deoxyribose
phosphate (dRP) lyase domain of polymerase beta (which is responsible for the removal of the 5'-dRP
intermediate formed during base excision repair) (Dalal et al, 2008). Thus we compared the dRP lyase activity
of K27N to wild type, using the dRP lyase assay (detailed in An et al, 2011; see Appendix). The results showed
that the K27N mutant significantly decreased both the Km and kcat, resulting in a small decrease in catalytic
efficiency, kcat/Km (Table 2).


Enzyme    kcat (min^-1)    Km (nM)    kcat/Km (nM^-1 min^-1)

WT        4.1 × 10^-3      262        1.6 × 10^-5
p.K27N    1.1 × 10^-3      77         1.4 × 10^-5

Table 2. Steady-state kinetic parameters for pol beta dRP lyase activity

Note: the kinetic values presented are calculated as detailed in An et al, 2011; see Appendix.
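
The Table 2 values can be checked with a few lines of arithmetic. The sketch below assumes that the exponents lost in extraction are negative (10^-3 for kcat and 10^-5 for kcat/Km), which is the reading under which the tabulated efficiency equals kcat divided by Km; the variable names are illustrative.

```python
# (kcat in min^-1, Km in nM); exponent signs are an assumption, see above.
TABLE_2 = {
    "WT": (4.1e-3, 262.0),
    "p.K27N": (1.1e-3, 77.0),
}

# Catalytic efficiency kcat/Km in nM^-1 min^-1 for each enzyme.
efficiencies = {name: kcat / km for name, (kcat, km) in TABLE_2.items()}

# Fold changes of the mutant relative to wild type.
kcat_ratio = TABLE_2["p.K27N"][0] / TABLE_2["WT"][0]
km_ratio = TABLE_2["p.K27N"][1] / TABLE_2["WT"][1]
efficiency_ratio = efficiencies["p.K27N"] / efficiencies["WT"]

for name, value in efficiencies.items():
    print(f"{name}: kcat/Km = {value:.2e} nM^-1 min^-1")
print(f"K27N/WT efficiency ratio = {efficiency_ratio:.2f}")
```

Because kcat and Km fall by similar factors (to roughly 27% and 29% of wild type), the ratio of the two changes little, which is why the decrease in catalytic efficiency is small.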
KEY RESEARCH ACCOMPLISHMENTS


• Identification of common somatic variants of the human pol β, pol η and pol κ genes in prostate cancer
tissue

• Characterization of the biochemical effects of all missense pol β prostate cancer tissue variations
identified


REPORTABLE  OUTCOMES 
Manuscripts 

An C, Chen D, and Makridakis NM. (2011). Systematic Biochemical Analysis of Somatic Missense Mutations
in DNA Polymerase β Found In Prostate Cancer Reveal Alteration of Enzymatic Function. Hum Mutat, 32:
415-423.

Abstracts

Chen D, Mukhopadhyay S, Yadav S, and Makridakis NM. (2011). Polymerase genes, genomic instability and
Prostate Cancer, LCRC Annual Scientific Retreat, Xavier University, New Orleans, LA.

Presentations

Error-prone polymerase mutations and prostate cancer progression, COBRE/Cancer Genetics group seminar,
Tulane University, New Orleans, LA, 08/10

Candidate genes, genomic instability and cancer progression, Genetics seminar, Tulane University, New
Orleans, LA, 03/11

Error-prone polymerases, genomic instability and prostate cancer progression, COBRE/EAB presentation,
Tulane University, New Orleans, LA, 03/11

Funding  applied  for  based  on  work  supported  by  this  award 

Polymerase  genes  and  genomic  instability  in  prostate  cancer  progression,  NCI/R01,  10/2010 


CONCLUSION 

Overall,  our  DNA  sequencing  analysis  so  far  shows  that  73%  of  the  40  prostate  cancer  patients  we  examined 
had  somatic  substitutions  in  at  least  one  of  the  polymerase  genes  tested  (Table  1).  Therefore,  error-prone 
polymerase  genes  are  commonly  mutated  in  prostate  tumor  tissue. 

Furthermore,  biochemical  analysis  of  the  pol  beta  missense  variations  showed  that  all  of  these  somatic 
mutations  have  functional  effects,  either  by  reducing  enzymatic  activity  (Fig.  1),  or  by  altering  the  fidelity  of 
DNA  synthesis  (Fig.  2).  Thus,  this  research  has  uncovered  the  novel  concept  that  functional  alteration  of  error- 
prone  DNA  polymerases  may  be  common  and  important  for  prostate  cancer  progression. 
"So what?"

Given  the  prevalence  of  functionally  important  mutations  of  polymerase  beta  in  prostate  tumors,  we  propose 
that  screening  for  these  somatic  mutations  in  prostate  biopsies  may  be  an  important  biomarker  for  the  detection 
of  prostate  cancer.  Characterization  of  these  mutations  in  additional  tumors  is  expected  to  determine  the 
potential  benefit  of  those  mutations  as  biomarkers  of  disease  outcome.  Similar  analysis  should  be  performed  for 
the  other  two  polymerase  genes,  as  proposed  in  our  plan.  This  approach  may  eventually  lead  to  the 
characterization  of  a  novel  mechanism  for  prostate  carcinogenesis  and  progression,  involving  the  accumulation 
of  mutations  resulting  from  a  defective  polymerase  genotype. 
ABSTRACT: Somatic mutations are hallmarks of cancer
progression. We sequenced 26 matched human prostate
tumor and constitutional DNA samples for somatic
alterations in the SRD5A2, HPRT, and HSD3B2 genes,
and identified 71 nucleotide substitutions. Of these
substitutions, 79% (56/71) occur within a
WKVnRRRnVWK sequence (a novel motif we call
THEMIS [from the ancient Greek goddess of prophecy]:
W = A/T, K = G/T, V = G/A/C, R = purine (A/G), and
n = any nucleotide), with one mismatch allowed. Literature
searches identified this motif with one mismatch
allowed in 66% (37/56) of the somatic prostate cancer
mutations and in 74% (90/122) of the somatic breast
cancer mutations found in all human genes analyzed. We
also found the THEMIS motif with one allowed
mismatch in 88% (23/26) of the ras1 gene somatic
mutations formed in the sensitive to skin carcinogenesis
(SENCAR) mouse model, after induction of error-prone
DNA repair following mutagenic treatment. The high
prevalence of the motif in each of the above mentioned
cases cannot be explained by chance (P < 0.046). We
further identified 27 somatic mutations in the error-prone
DNA polymerase genes pol η, pol κ, and pol β in
these prostate cancer patients. The data suggest that most
somatic nucleotide substitutions in human cancer may
occur in sites that conform to the THEMIS motif. These
mutations may be caused by “mutator” mutations in
error-prone DNA polymerase genes.
Introduction

Cancer  is  thought  to  evolve  through  the  accumulation  of 
somatic  mutations  in  specific  genes,  depending  on  tumor  type 
[Vogelstein  and  Kinzler,  2004].  These  mutations  are  caused  by  a 
combination  of  environmental  and  heritable  factors  [Lichtenstein 
et  al.,  2000].  To  date  scientists  have  been  unable  to  identify  a 
common  motif  at  the  sites  of  these  somatic  mutations,  suggesting 
that  these  somatic  events  have  distinct  molecular  etiologies, 
depending,  among  other  factors,  on  the  individual  and  the  type  of 
tumor.  We  tested  the  hypothesis  that  there  is  a  common  motif  at 
the  sites  of  somatic  mutations  in  prostate  cancer  tissue,  a  tumor 
type  with  poorly  understood  etiology. 

We screened the HSD3B2, SRD5A2, and HPRT genes for
somatic mutations. The SRD5A2 gene (MIM# 607306) encodes
the steroid 5α-reductase type II enzyme that reduces testosterone
(T) to dihydrotestosterone (DHT), the most active androgen in
the prostate [Cheng et al., 1993]. Thus activating 5α-reductase
mutations may contribute to prostate tumor development. A
possible contribution of the SRD5A2 gene to prostate tumor
progression was also proposed based on our findings of de novo
somatic events at the SRD5A2 locus [Akalu et al., 1999a]. More
recently, we reported the biochemical characterization of somatic
SRD5A2 mutations in human prostate cancer tissue [Makridakis
et al., 2004], including mutations that increase enzyme activity.
The HSD3B2 gene (MIM# 109715) also presents an attractive
androgen metabolic candidate gene for prostate cancer risk and
progression. The type II 3β-hydroxysteroid dehydrogenase enzyme
encoded by the HSD3B2 gene is mainly expressed in androgenic
tissues [Labrie et al., 1992; Lachance et al., 1991], and initiates
DHT inactivation [Cheng et al., 1993]. However, if a specific
mechanism generates somatic mutations at distinct motifs, then it
should do so even in genes expected to be unrelated to tumor
progression, such as the human HPRT gene. The ubiquitously
expressed HPRT gene (MIM#s 308000 and 300322) is located on
the X chromosome [Pai et al., 1980] and encodes hypoxanthine-
guanine phosphoribosyl transferase, an enzyme essential for the
purine salvage pathway, responsible for 90% of nucleic acid
biosynthesis in normal cells [Zoref and Sperling, 1979].
We report here 71 somatic mutations in these three genes.
Analysis  of  the  nucleotide  sequence  surrounding  these  mutations 
revealed  that  these  alterations  often  occur  within  a  novel  motif  we 
call  THEMIS  (from  the  ancient  Greek  goddess  of  prophecy).  In 
search  of  a  potential  mutagenic  mechanism  we  found  that  the 
THEMIS  motif  commonly  occurs  at  the  sites  of  somatic  mutations 
induced  by  error-prone  (EP)  DNA  repair  in  the  sensitive  to  skin 
carcinogenesis  (SENCAR)  mouse  skin  model  following  mutagenic 
treatment  [Chakravarti  et  al.,  2000,  2001]. 

EP DNA repair (also called translesion synthesis [Goodman,
2002]) involves DNA polymerases such as pol η (MIM# 603968)
and pol κ (MIM# 605650), which are much more accurate when
replicating through specific types of DNA damage than
undamaged DNA [Pages and Fuchs, 2002]. EP DNA polymerases have
been proposed to play a role in cancer etiology [see Kunkel, 2003].
To date, the only known example of an EP DNA polymerase
causing human cancer comes from pol η [Kunkel, 2003];
constitutional DNA mutations that inactivate pol η are associated
with a high rate of skin cancers in patients suffering from XPV
(xeroderma pigmentosum variant [Johnson et al., 1999]). In the
absence of pol η, ultraviolet (UV) radiation-induced pyrimidine
dimers are bypassed in a manner that generates the mutations
which lead to skin cancer [Kunkel, 2003], perhaps by another EP
DNA polymerase [Pages and Fuchs, 2002]. Thus EP polymerase
mutations can result in multiple tumor-inducing mutations
(mutator phenotype [Loeb et al., 2003]) given the right type
(and amount) of environmental exposure.

Human pol β (MIM# 174760) is a DNA polymerase essential
for base excision repair (BER) [Sancar et al., 2004]. BER is one of
the major pathways of DNA repair that removes oxidized and
alkylated bases from DNA [Friedberg, 2003]. Pol β is not a classic
error-prone polymerase, yet it causes 67 times more substitution
errors than mammalian pol δ [Kunkel, 2003]. Pol β is also
involved in meiotic recombination [Goodman, 2002] and repair
of double-stranded DNA breaks through the process of non-
homologous end-joining [Wilson and Lieber, 1999]. Interestingly,
both the pol κ and pol β genes are located in chromosomal
regions known to be lost during prostate cancer progression (5q
and 8p, respectively; e.g., Visakorpi et al. [1995]). Thus, somatic
loss of these polymerases may contribute to prostate tumor
progression.

We screened the pol β, η, and κ genes for somatic prostate
cancer mutations, to test the hypothesis that EP polymerases are
commonly mutated in prostate cancer tissue. We report 27
somatic mutations in these EP polymerase genes in the same
prostate cancer tissues that have additional somatic mutations in
the other analyzed genes reported here (i.e., HSD3B2, HPRT, and
SRD5A2).

Materials  and  Methods 

Tumor  Specimens 

We  analyzed  26  patients  of  Caucasian  background  with 
prostatic  adenocarcinoma  (for  further  description  see  Akalu 
et  al.  [1999a]  and  Makridakis  et  al.  [2004]).  These  patients 
underwent radical prostatectomy at the USC Norris Comprehensive
Cancer Center. Prostate tissue and blood were collected from
each patient. Tumors were staged according to the tumor, nodes,
metastases (TNM) staging system (see Supplementary Table S1a;
available online at
http://www.interscience.wiley.com/jpages/1059-7794/suppmat)
[Schroder et al., 1992]. Local Institutional Review
Board (IRB) approval was obtained before study initiation.


Microdissection  and  DNA  Extraction 

Specimens were formalin fixed, embedded in paraffin,
sectioned, and transferred onto microscope slides, where they were
deparaffinized and stained with hematoxylin and eosin. Selected
populations  of  carcinoma  cells  were  microdissected  and  tumor 
DNA  was  then  extracted  from  the  microdissected  cells  using  a 
method  reported  by  us  earlier  [Akalu  and  Reichardt,  1999b].  As 
control,  normal  (constitutional)  DNA  was  extracted  either  from 
microdissected  normal  cells  adjacent  to  the  tumor  or  from 
peripheral  blood  leukocytes  (or  both). 

Molecular  Analysis 

PCR 

The entire coding region of the HSD3B2 gene, together with the
exon-intron splice junction boundaries, the putative promoter
region, and the 5' and 3' untranslated regions (UTRs), was
amplified by PCR using sets of primers as previously
described [Rheaume et al., 1992; Simard et al., 1993]. Purified
(desalted) oligonucleotides were obtained from IDT (Coralville,
IA). Reactions were performed with the polymerases AmpliTaq
Gold (Applied Biosystems [ABI], Foster City, CA) or HotStart Taq
(Qiagen, Valencia, CA) with their corresponding PCR buffer.
Reactions carried out with HotStart Taq had an additional reagent
(5× Q-solution) provided by the manufacturer. The reaction
mixture consisted of 10 to 20 ng of DNA, 1× PCR buffer, 1.5 mM
MgCl2, 200 µM dNTPs, 0.1 to 0.2 µM of each forward and reverse
primer, 1× Q-solution (when necessary), 2.5 units of
polymerase, and sterile water, in a final volume of 50 µL. The
reaction was then covered with mineral oil and subjected to
thermal cycling in a RoboCycler Gradient 40 (Stratagene, La Jolla,
CA) under the following conditions: an initial pre-PCR heat step
of 95°C for 2 min (AmpliTaq Gold) or 15 min (HotStart Taq), 50
cycles of denaturation at 95°C for 1.5 min, annealing at 58 to 70°C
for 1.5 min, and elongation at 72°C for 1.5 min. This was followed
by a final extension step at 72°C for 10 min. PCR products were
purified with the QIAquick Gel Extraction Kit (Qiagen) and then
sequenced.

Exons 7-8 of the HPRT gene were PCR amplified as a single
379-bp fragment as described above, except that we used primers
described elsewhere [Liu et al., 2003]. Sequencing analysis of the
HPRT gene was performed as in the following paragraphs. The
original analysis of the SRD5A2 gene is described in Makridakis
et al. [2004].

The EP polymerase genes pol β, pol η, and pol κ were PCR-
amplified as described above, except that we used the primers
shown in Supplementary Table S1b.

Sequencing 

Sequencing reactions consisted of 20 to 60 ng of the purified
PCR product, 3.2 pmol of each PCR primer, 4 µL of 5×
sequencing reaction buffer (ABI), 4 µL of ABI PRISM™ Dye
Terminator Cycle Sequencing Ready Reaction Mix (ABI), and
sterile water in a total volume of 20 µL. Reactions were submitted
to the following thermal cycle: 96°C for 30 s, 50°C for 15 s, and
60°C for 4 min, for a total of 30 cycles. The PCR reactions were
then purified according to the manufacturer’s recommendations
and submitted to electrophoresis. Nucleotide sequences were
collected on either an ABI PRISM 377 or a 3100 Automated DNA
Sequencer (ABI). Results were processed with the ABI PRISM
Sequence Navigator software (ABI). Nucleotide substitutions were
identified and quantitated by ABI PRISM Factura Feature
Identification software with “identify heterozygote limit” set at
10% (ABI). Each nucleotide substitution was confirmed by at least
three independent PCR sequencing analyses.

Nucleotide  changes  are  numbered  according  to  the  appropriate 
genomic  reference  sequence,  as  noted,  according  to  journal 
guidelines  (www.hgvs.org/mutnomen). 

Microsatellite  analysis 

The complex dinucleotide repeat in intron 3 of the HSD3B2 gene
was amplified by PCR using a fluorescent primer. Tumor DNA and
its matched normal DNA were amplified by PCR using the set of
primers previously described [Verreault et al., 1994]. In some cases,
the alternative reverse primer 5'-TGGACCTATGTTTGTGTGTGG-3'
was utilized, yielding fragments 96 bp shorter than those described
by Verreault et al. [1994]. The forward primer [Verreault et al.,
1994] was labeled on the 5' end with the fluorescent dye TET™.
PCR reactions were performed with HotStart Taq as described
above and submitted to the following procedure: initial
denaturation step for 15 min at 95°C, 50 cycles with a denaturation step at
95°C for 1 min, annealing at 62°C for 1 min, and extension at 72°C
for 1 min, followed by a final extension step at 72°C for 10 min.
PCR products were loaded on an ABI PRISM 377 Automated
DNA Sequencer together with an internal size standard
(Genescan-500™ tetramethylrhodamine [TAMRA]; ABI, Foster City, CA),
according to the manufacturer’s recommendations. Genotyping
analyses were performed using the Genescan software (ABI).
Experiments were performed in triplicate.

Statistics 

All  P  values  were  calculated  using  the  chi-square  test,  with  the 
Yates  correction  when  appropriate  (for  one  degree  of  freedom). 

Determination  of  the  Actual  Target  of  the  THEMIS  Motif 

For each gene, we analyzed the sequence that contained the
somatic mutations (usually the sequence that was PCR-amplified,
except for androgen receptor (AR) and TP53, for which we used
the cDNA sequence with the addition of short patches of intronic
sequence, for the mutations that were on intron-exon
boundaries). The analysis was performed using a software program that
finds known motifs in DNA sequence imported by the user, called
“dna-pattern (strings)” (available at:
http://rsat.ulb.ac.be/rsat/dna-pattern_form.cgi).
We ran this program online using both 0 and 1
allowed substitutions from the motif, and analyzed the results.

Results 

Our prostate cancer samples were previously described [Akalu
et al., 1999a]. Supplementary Table S1a summarizes patient
information including age and tumor stage as well as all results
concerning microsatellite instability (MSI) and nucleotide
substitutions in the HSD3B2 gene. These results are presented in
detail in the sections that follow.

Somatic  Mutations  in  the  HSD3B2  and  HPRT  Genes  in 
Prostate  Cancer  Tissue 

If there is a common somatic mutation motif in prostate cancer
tissue, then it should be ubiquitous. We thus decided to screen
both genes containing potential “driver” mutations (such as
HSD3B2; Makridakis et al. [2005]) and passenger mutations (such
as HPRT). Initially, the tumor DNA of each patient was screened
for somatic alterations, by automated DNA sequencing. Once
nucleotide alterations were detected, the constitutional DNA of
the same patient was also analyzed to examine if the event was
somatic. Somatic events were quite common. In fact, among the
26 patients studied, only five had no detectable nucleotide
HSD3B2 alterations in their tumor (Supplementary Table S1a).
We detected 38 single-nucleotide somatic mutations total in this
gene: one deletion and 37 nucleotide substitutions
(Supplementary Table S1a). The HPRT gene was also investigated for somatic
alterations in prostate cancer by DNA sequencing: 17 somatic
nucleotide substitutions were identified (Table 1). Although only a
fraction of the HPRT gene was screened, 9 out of the 26 patients
analyzed harbor somatic nucleotide substitutions (Table 1).

Nucleotide  Sequence  Context  and  Nature  of  the  Somatic 
Mutations 

The high number of somatic mutations identified in prostate
cancer tissue allowed us to determine the sequence context of
these nucleotide substitutions. This analysis revealed that most
mutations occur within a WKVnRRRnVWK (THEMIS) motif
when one mismatch is allowed (W = A/T; K = G/T; V = G/A/C;
R = purine (A/G); n = any nucleotide; total number of n = 0-2
nucleotides; the underline indicates the position of the mutated
base; Table 2).
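
The motif definition above can be turned into a small matcher. The following is a minimal sketch, not the pattern-search program used in this study. It assumes that each spacer n may be 0 to 2 nucleotides long (the sequence contexts in Table 1 include spacers of one and two nucleotides on either side), that the mutated base may sit at any of the three RRR positions, and that both DNA strands are searched. The function names and the example sequences are illustrative only.

```python
from itertools import product

# IUPAC codes used by the motif: W = A/T, K = G/T, V = A/C/G, R = A/G.
IUPAC = {"W": "AT", "K": "GT", "V": "ACG", "R": "AG"}
FIXED_PATTERN = "WKVRRRVWK"  # the nine non-spacer positions of WKVnRRRnVWK
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(bases, pattern):
    """Count positions where a base is not allowed by the IUPAC code."""
    return sum(base not in IUPAC[code] for base, code in zip(bases, pattern))


def fits_on_strand(seq, pos, max_mismatch):
    """Check one strand: does any spacer layout place seq[pos] inside RRR?"""
    # Assumption: spacers of 0-2 nt on each side; mutated base anywhere in RRR.
    for left_gap, right_gap, core_offset in product(range(3), repeat=3):
        core_start = pos - core_offset
        start = core_start - left_gap - 3
        end = core_start + 3 + right_gap + 3
        if start < 0 or end > len(seq):
            continue  # window would run off the sequence
        fixed = (seq[start:start + 3]
                 + seq[core_start:core_start + 3]
                 + seq[end - 3:end])
        if count_mismatches(fixed, FIXED_PATTERN) <= max_mismatch:
            return True
    return False


def fits_themis(seq, pos, max_mismatch=1):
    """True if the base at 0-based index pos fits the motif on either strand."""
    seq = seq.upper()
    rc_pos = len(seq) - 1 - pos
    return (fits_on_strand(seq, pos, max_mismatch)
            or fits_on_strand(reverse_complement(seq), rc_pos, max_mismatch))


# Context modeled on a Table 1 entry (AGA T GGt TA AAT): fits with one mismatch.
print(fits_themis("AGATGGTTAAAT", 5, max_mismatch=1))
```

Setting `max_mismatch=0` corresponds to the exact-fit analysis, and `max_mismatch=1` to the one-mismatch analysis reported in the text.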

A total of 30 out of the 38 (79%) nucleotide alterations detected
in the HSD3B2 gene fit the THEMIS motif with up to one
mismatch (Supplementary Table S2). The actual target of the
motif (the RRR sequences occurring within the WKVnRRRnVWK
context with one allowed mismatch) covers 60% of the HSD3B2
sequence analyzed (see Materials and Methods for calculation).
Thus, 22.8 (60%) of the HSD3B2 mutations are expected to occur
in the motif (with up to one mismatch), compared to the 30
(79%) observed (P = 0.026). We then tested the occurrence of the
motif in the HPRT and SRD5A2 somatic mutation sites: 15 out of
17 (88%) of the somatic HPRT mutations and 11 out of 16 (69%)
of the previously reported [Makridakis et al., 2004] somatic
SRD5A2 mutations fit the motif with one allowed mismatch
(Table 1; Supplementary Table S3). The actual motif target (with
one allowed mismatch) covers 68% of the HPRT and 57% of the
SRD5A2 sequences analyzed (see Materials and Methods). Thus,
20.6 (out of the 33) HPRT/SRD5A2 mutations are expected to fit
the motif with one allowed mismatch, compared to the 26 (15 +
11) observed (P = 0.05). Comprehensive analysis of all the HPRT,
HSD3B2, and SRD5A2 somatic mutations indicates that the
number of mutations that fit the motif with up to one mismatch
(56) is significantly higher (P = 0.005) than expected (43.4), based
on the number of motifs that exist in these genes.
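
The expected counts quoted above follow from multiplying the number of mutations in each gene by the fraction of the analyzed sequence covered by the motif target. The sketch below reproduces that arithmetic and adds a one-sided exact binomial tail as one possible way to compare observed and expected counts. The choice of test is an assumption made here for illustration (the test actually used is described in Materials and Methods), so the resulting probability need not equal the P values reported in the text. Because the coverage values are the rounded percentages quoted above, the summed expectation can differ slightly from the reported 43.4.

```python
from math import comb


def expected_fits(n_mutations, target_fraction):
    """Expected number of motif-fitting mutations under uniform targeting."""
    return n_mutations * target_fraction


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# gene: (mutations analyzed, motif-target coverage, observed fits), from the text
genes = {
    "HSD3B2": (38, 0.60, 30),
    "HPRT": (17, 0.68, 15),
    "SRD5A2": (16, 0.57, 11),
}

expected = {gene: expected_fits(n, frac) for gene, (n, frac, _) in genes.items()}
total_expected = sum(expected.values())
total_observed = sum(obs for _, _, obs in genes.values())

# Illustrative one-sided test for HSD3B2 (30 of 38 observed, 60% coverage).
hsd3b2_tail = binomial_upper_tail(30, 38, 0.60)

print(expected)
print(total_observed, round(total_expected, 1))
print(hsd3b2_tail)
```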

Of the 54 somatic SRD5A2 and HSD3B2 substitutions, 94% are
transitions (Supplementary Table S1a) [Makridakis et al., 2004]. In
contrast, only 52% of the 63 constitutional DNA variations found
in these genes in controls and in patients with HSD3B2 deficiency
[Makridakis et al., 2004; Pang, 2001] (and data not shown) are
transitions (P = 0.000003). Moreover, 88% of the 17 HPRT somatic
substitutions are transitions (Table 1). In contrast, only 51% of the
59 listed HPRT SNPs
(http://snpper.chip.org/bio/export-sequence/20797) are transitions
(P = 0.019). Thus, transitions are significantly more common at the
somatic mutation sites.
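
As a worked illustration of the transition/transversion classification, the sketch below tallies the 17 distinct HPRT substitutions listed in Table 1 (recurrent changes counted once). A transition is a purine-to-purine or pyrimidine-to-pyrimidine change; anything else is a transversion. The helper name and the data layout are illustrative, and the changes were transcribed from Table 1 as printed.

```python
PURINES = {"A", "G"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and mutant base are identical")
    return (ref in PURINES) == (alt in PURINES)


# Distinct HPRT somatic substitutions from Table 1 (genomic position: change).
hprt_changes = {
    39835: "C>T", 39859: "G>A", 39890: "A>G", 39897: "C>T", 39907: "C>T",
    39929: "G>A", 39939: "G>T", 39959: "A>G", 39971: "G>T", 39985: "T>C",
    40000: "C>T", 40011: "T>C", 40051: "C>T", 40055: "A>G", 40073: "A>G",
    40086: "T>C", 40091: "A>G",
}

n_transitions = sum(
    is_transition(*change.split(">")) for change in hprt_changes.values()
)
percent = round(100 * n_transitions / len(hprt_changes))
print(n_transitions, len(hprt_changes), percent)  # 15 17 88
```

The tally agrees with the 88% transition frequency stated for the HPRT substitutions.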

The  inclusion  of  potential  driver  mutations  (such  as  the  missense 
SRD5A2  mutations)  [Makridakis  et  al.,  2004]  in  our  analysis  may 
result  in  a  motif  that  is  biased  toward  variants  that  survive  the  tumor 
selection  process  rather  than  the  underlying  mutation  process.  To 
examine  this  possibility  we  analyzed  the  presumed  neutral 
Table 1. Somatic Nucleotide Substitutions in the HPRT Gene*

                                                Mutant nucleotide peak (%)
Patient  DNA change      Type of change/        Forward    Reverse     Sequence context(b)
                         location(a)            sequence   sequence

1        g.39835C>T(c)   P169S                  15         25          gTG GGG TC CTT
         g.39897C>T      Intronic               22         15          AGA T GGt TA AAT
         g.39985T>C      Intronic               34         25          AGA TT tAA AAG
         g.40051C>T(c)   P184S                  19         34          TGt CT GGA ATT
4        g.39835C>T(c)   P169S                  19         29          gTG GGG TC CTT
         g.40051C>T(c)   P184S                  23         34          TGt CT GGA ATT
6        g.39835C>T(c)   P169S                  14         24          gTG GGG TC CTT
         g.40051C>T(c)   P184S                  21         32          TGt CT GGA ATT
11       g.40055A>G      D185G                  28         35          TTC C AGA C AAG
         g.40073A>G      Y191C                  28         20          gGA T AtG CC CTT
         g.40091A>G      E197G                  29         22          ATG AAt A CTT
12       g.39959A>G      Intronic               32         38          TTG T AGA GAG
16       g.39890A>G(c)   Intronic               33         32          TTA A AtG ATT
         g.40086T>C      Y195Y                  44         41          TGA CT AtA ATG
20       g.39929G>A      Intronic               18         19          TGA AA tGG CTT
         g.40000C>T(c)   Intronic               38         42          ATA AAG AAa
21       g.39859G>A      D177N                  81         69          AGC C AGA CTG
         g.39939G>T      Intronic               50         66          ATA A tGG CTT
         g.40011T>C      Intronic               37         30          ATC TA AAt GAT
27       g.39907C>T(c)   Intronic               65         89          TTA GGt TA AAG
         g.39971G>T(c)   Intronic               23         33          TGG cAA ATG

*The HPRT nucleotide changes are numbered according to the genomic reference sequence (GenBank: M26434.1).
(a) In the “Type of change” column, missense mutations are indicated using the single letter code.
(b) In the “Sequence context” column, the nucleotide sequence is displayed using the DNA strand that rendered the best fit to the THEMIS motif. The mutated base is indicated
in bold. Lowercase letters indicate mismatches within the consensus motif. The spacer nucleotides are underlined. Sequence contexts that conform to the motif are shaded. The
mutant peak intensity is reported as percentage of total peak intensity.
(c) Mutations that fit the motif only in the nontranscribed strand (the rest of the mutations fit the motif in the transcribed strand).


Table 2. Most Somatic Mutations in Cancer Fit the THEMIS Motif*

Breast                      Prostate
Gene        % Fit           Gene                     % Fit

CHEK2       90              H-RAS, K-RAS, N-RAS      50
CSNK1E      82              TP53                     67
GATA3       67              PTEN                     100
LMO4        100             AR                       68
CDH-1       84              HSD3B2                   79
DLG1        67              SRD5A2                   69
UBR-5       80              HPRT                     88
BRCA2       100
PHB         60
PTEN        64
TP53        67

*The percentage of somatic mutations that fit the WKVnRRRnVWK motif (with one
mismatch allowed) is given, by gene and type of cancer. Bold indicates data reported
in this report. The rest of the data was analyzed from the published literature. For
details and references see Results.

substitutions  (i.e.,  those  that  do  not  change  the  protein  structure) 
separately:  this  data  suggests  that  neutral  substitutions  in  all  three 
genes  fit  the  motif  well  (P  =  0.005;  Supplementary  Table  S4a)  and 
that  in  fact  there  is  no  significant  difference  in  the  motif-fitting 
ratios  among  presumed  drivers  and  presumed  neutral  substitutions 
(P  =  0.3;  Supplementary  Table  S4b). 

Nucleotide substitution rates vary according to sequence
context and clearly depend on the nearest neighbor nucleotides
(e.g., Lunter and Hein [2004]). We examined whether an
established model that predicts nucleotide substitution rates
based on sequence context (the dinucleotide substitution model;
Lunter and Hein [2004]) fits the observed mutation spectra better
than the THEMIS motif. The data, presented in Supplementary
Table S5, suggest that the dinucleotide substitution model
predicts a distribution that differs markedly from the observed
one (P < 0.0001). Thus, the observed somatic mutations in all the
genes we examined do not fit the distribution expected under the
dinucleotide substitution model, but instead frequently fit the
THEMIS motif.

The  THEMIS  Motif  in  the  Cancer  Literature 

The discovery of the THEMIS motif prompted us to examine
the published literature for other kinds of mutations that may fit
this motif. This search revealed that the motif is found (with one
allowed mismatch) in 88% (23/26) of the ras1 gene mutations
detected in the SENCAR mouse skin cancer model after
benzo[a]pyrene [Chakravarti et al., 2000] or estradiol-3,4-quinone
[Chakravarti et al., 2001] treatment (Supplementary
Table S6; and data not shown). Benzo[a]pyrene is a known
etiologic agent of both skin and lung cancer [Chakravarti et al.,
2000]. These mouse skin H-ras mutations were the result of EP
DNA repair [Chakravarti et al., 2000, 2001]. Of the 26 ras1 gene
mutations, 24 were also transitions (Supplementary Table S6; data
not shown). The actual motif target covers 62% of the ras1
sequence analyzed (see Materials and Methods). Thus, 16.2 (62%)
of the ras1 mutations are expected to occur in the motif (with one
mismatch), compared to the 23 (88%) observed (P = 0.01). The
frequency of the ras1 mutations that fit the exact motif (without
mismatches) is also higher than expected (35% vs. 21%)
(Supplementary Fig. S1), but this trend is not significant.

Additional literature searches identified the THEMIS motif in
66% (37/56) of the somatic prostate cancer mutations that we found
in TP53, H-ras, K-ras, N-ras, PTEN, and the AR gene (Table 2). Of
these mutations, 68% are transitions. Examination of the somatic
mutations in the AR gene database [Gottlieb et al., 2004] revealed
that 68% of the mutations fit this motif (with up to one mismatch)
and 80% are transitions. Moreover, one-half of the most common
somatic mutations (in prostate cancer) that activate the H-ras,
K-ras, and N-ras oncogenes, at codons 12, 13, and 61, also fit the
THEMIS motif with one mismatch allowed. Regarding somatic
substitutions in the TP53 gene in prostate cancer [Chi et al., 1994],
there is a prevalence of transitions over transversions and 67%
(14/21) of the mutations fit this motif with up to one mismatch. The
most common somatic PTEN mutation, found in 1 out of 6 prostate
cancer tissues [Pesche et al., 1998], also fits this motif. To test the
significance of these observations, we calculated the expected
number of the AR gene mutations that fit the THEMIS motif
based on the actual motif target (see Materials and Methods) in the
AR gene, the most commonly mutated gene in this dataset: 42 out of
the 62 (68%) somatic AR mutations [Gottlieb et al., 2004] fit the
motif with one mismatch, compared to 32.8 (53%) expected
(P = 0.027). The frequency of the somatic AR mutations that fit the
THEMIS motif exactly (without mismatches) is also higher than
expected (16 observed vs. 10.4 expected), but this trend does not
reach statistical significance (P = 0.08). Moreover, 80% of the
somatic prostate cancer mutations in the AR gene [Gottlieb et al.,
2004] are transitions, while only 57% of the constitutional AR SNPs
(http://snpper.chip.org/bio/export-sequence/20479) are transitions
(P = 0.0054).

After detecting the THEMIS motif in the most common human
malignancy in U.S. males [Jemal et al., 2007], we decided to also
examine the most common malignancy in females, breast cancer
[Jemal et al., 2007]. The data presented in Supplementary Table S7
show that 74% (90/122) of all somatic breast cancer mutations
found in the literature fit this motif with up to one mismatch. The
list of analyzed genes includes CHEK2 [Staalesen et al., 2004;
Sullivan et al., 2002], CSNK1E [Fuja et al., 2004], GATA3 [Usary
et al., 2004], LMO4 [Sutherland et al., 2003], CDH-1 [Becker et al.,
1999; Berx et al., 1996], DLG1 [Fuja et al., 2004], EDD/hHYD
(UBR-5) [Fuja et al., 2004], BRCA2 [Lancaster et al., 1996], PTEN
[Kurose et al., 2002], TP53 [Sullivan et al., 2002; Kurose et al., 2002;
Thorlacius et al., 1993], and PHB [Sato et al., 1992, 1993] (Table 2).
To test the significance of this finding, we calculated the expected
number of somatic mutations that fit the THEMIS motif in the most
commonly mutated gene in this dataset (TP53), based on the actual
motif target in TP53: 29 out of the 43 (67%) somatic TP53
mutations fit the motif with one mismatch allowed, compared to
23.5 expected (see Materials and Methods). This higher than
expected incidence does not reach statistical significance (P = 0.089).
However, 10 out of the 43 (23%) somatic TP53 mutations fit the
motif exactly, compared to 5.6 (13%) expected, and this higher
incidence is significant (P = 0.046). Moreover, 58% of all somatic
breast cancer mutations reported in these genes are transitions
(Supplementary Table S7). Of the TP53 SNPs
(http://snpper.chip.org/bio/export-sequence/7966), 55% are
transitions. The difference in the transition frequency between
breast cancer and constitutional TP53 substitutions is not
significant.

Recently, a GA pattern (or TC in the opposite DNA strand; the
mutated base is underlined) was identified at the sites of somatic
mutations in the protein kinase gene family in breast cancer
[Stephens et al., 2005]. Interestingly, this pattern emerged mostly
from two breast cancer samples thought to display a mutator
phenotype [Stephens et al., 2005]. Since this pattern is similar to
the purine core of the WKVnRRRnVWK motif, we decided to
analyze these mutations for the presence of the THEMIS motif: we
found that 59 out of the 88 (67%) breast cancer mutations fit the
THEMIS motif with one allowed mismatch. In contrast, only 8 out
of the 71 prostate cancer mutations that we report here occur in the
GA motif (8.9 of 71, i.e., 12.5%, expected; P = 0.75). The GA motif
is overrepresented, though, in the breast cancer mutation dataset
(Supplementary Table S7): 25 out of the 122 mutations fit the GA
motif, compared to 15.3 expected (P = 0.01). Thus, both breast and
prostate cancer mutations occur in the context of a purine-rich core
motif, but in breast cancer this core is often GA.

Somatic Mutations in the pol β, pol η, and pol κ Genes in
Prostate Cancer Tissue

The significant presence of the THEMIS motif at the sites of
somatic ras1 mutations induced by EP DNA repair suggests that
such EP DNA polymerases may be involved in the etiology of (at
least some of) the somatic mutations that fit this motif.
Accordingly, we decided to sequence selected exons from each of
the EP DNA polymerase genes pol β, pol η, and pol κ for
somatic mutations in our prostate cancer tissues, to test the
hypothesis that prostate cancer tissue bears common somatic
mutations in these genes. This preliminary analysis identified
somatic mutations in all three genes, but the gene with the most
(and the most prevalent) mutations was pol β. Therefore, we
screened the complete coding sequence of the pol β gene in these
patients. The result of these analyses in all polymerase genes is
shown in Table 3: we identified 27 somatic mutations in these 26
samples, 14 of which were missense substitutions, nine were silent
or intronic substitutions, two substitutions changed splice acceptor
(AG) sites, one was in the promoter region, and one was in the
5'-UTR. The G31438A intronic substitution is recurrent in Patients 5
and 30 (Table 3). The P242R missense substitution was also
present in the constitutional DNA of Patient 13, but with altered
prevalence (40% average mutant peak in the tumor tissue [Table 3]
compared to 60% average mutant peak in the constitutional DNA
[data not shown]). Thus, an additional somatic event may have
occurred in the tumor at this pol β site. Overall, among 26
patients analyzed, 19 patients (73%) have somatic mutations in an
EP polymerase gene and 16 patients (61%) have somatic
mutations in pol β.

Somatic  Instability  in  Prostate  Cancer  Tissue 

Since the HSD3B2 gene was most commonly “hit” in our
sequencing analysis of these prostate cancer tissues, we decided to
search for other kinds of genomic instability at this locus.
Accordingly, we investigated 20 informative patients for MSI and
loss of heterozygosity (LOH) in the complex dinucleotide repeat
in intron 3 of the HSD3B2 gene [Devgan et al., 1997], by
comparing tumor and matched normal DNA. The results show
that 70% of the patients analyzed have LOH/MSI in the HSD3B2
locus (Supplementary Table S1a). The most common findings
were MSI, manifested as follows: five cases with contraction of the
alleles of the tumor DNA, four cases with expansion, and three
cases with a combination of both contraction and expansion (data
not shown). Three patients had LOH in the tumor tissue
(Supplementary Table S1a).

Discussion 

Somatic  Mutations  Are  Commonly  Found  in  Prostate 
Cancer  Tissue 

We  report  a  high  number  of  somatic  prostate  cancer  mutations 
in  all  three  “target”  genes  studied.  A  total  of  38  nucleotide 
Table 3. Somatic Mutations Detected in Polymerase Genes in Prostate Cancer Patients*

Gene (accession #)   Patient  DNA change      Type of change/      Mutant     Predicted/known effect
                                              location             peak (%)

POLB (AF491812.1)    1        g.23912G>A      AG-splice junction   59         Deletion of amino acids 184-185
                     2        g.25254G>A      E216K                52
                     4        g.960C>T        5'-UTR               52
                              g.18145T>C      N128N                54
                     5        g.32481C>T      P261L                100        Altered fidelity
                              g.32573A>G      T292A                100        Altered fidelity
                              g.32592T>C      I298T                100
                              g.32439G>A(a)   Intron 12            79
                     8        g.11628T>C      Intron 3             27
                     10       g.25314A>T      M236L                38         Altered fidelity
                     11       g.15180G>A      E123K                100
                     12       g.1444G>C       K27N                 60
                     13       g.25236C>T      L210L                42
                              g.31911C>G(b)   P242R                40         Altered fidelity
                     15       g.921C>T        Promoter             100
                     20       g.25302G>A      E232K                100
                     23       g.12622A>G      AG-splice junction   25         Deletion of amino acids 88-90
                     24       g.11630T>C      Intron 3             30
                     29       g.32521G>T      G274G                35
                     30       g.32467T>C      Intron 12            51
                              g.32439G>A(a)   Intron 12            100
POLH (AY388614.1)    1        g.28891G>A      G259R                23
                     6        g.29021A>T      Intron 7             46
                     20       g.28901G>C      G263A                60         XPV
                              g.28967C>T      S284F                66
POLK (AY273797.1)    12       g.67088C>A      T205K                28
                     21       g.66992A>G      Intron 5             68         Changes Lariat-A
                     27       g.67088C>T      T205I                30

*For both mutations that change the invariant splice junction (AG), an in-frame AG exists shortly downstream. Utilization of the alternative AG is predicted to result in the
deletions shown in the last column. XPV denotes a pol η residue mutated in an XPV patient. For further details, see Results.
(a) Creates a new branch (Lariat-A) site.
(b) Also in constitutional DNA (rs3136797).


alterations were detected in the HSD3B2 gene in 80% of the 26
patients. We also identified 16 somatic SRD5A2 substitutions in
60% of the patients [Makridakis et al., 2004] and 17 somatic
HPRT substitutions in 35% of the patients (Table 1). Collectively,
we found a total of 71 somatic mutations in these three genes in
prostate cancer. The high rate of somatic events identified, even in
the HPRT gene, suggests that there may be generalized genomic
instability, at least in some of these tumors. As summarized in
Table 4, four patients harbor somatic substitutions in all three
“target” genes examined, and most of these substitutions fit the
THEMIS motif. Significantly, all of these patients have additional
somatic mutations in an EP polymerase gene (Table 4).

Nucleotide  Sequence  Context  of  the  Somatic  Substitutions 

Analysis of the nucleotide sequence that surrounds each
HSD3B2, SRD5A2, and HPRT gene mutation showed that 79%
of the 71 somatic mutation sites fit the THEMIS motif, with up to
one mismatch (P = 0.005; Table 2). The prevalence of this motif at
somatic mutation sites led us to hypothesize that it is common,
perhaps even universal. Review of the published literature
identified this motif (with up to one mismatch) in 66% of all
somatic prostate cancer mutations and 74% of all somatic breast
cancer mutations (P = 0.027 for prostate; P = 0.046 for breast
cancer). Moreover, 67% of the somatic breast cancer mutations
thought to result through a mutator phenotype in the protein
kinase gene family [Stephens et al., 2005] fit the THEMIS motif
with one allowed mismatch. Our analyses, therefore, suggest that
most somatic mutations in the most common human malignancies
fit the THEMIS motif with up to one mismatch.

Analysis of the type of mutations found in the HSD3B2,
SRD5A2, and HPRT genes showed that transitions are significantly
more common than expected (P < 0.019). This finding was
confirmed among the other somatic prostate cancer mutations
that we found in the published literature. However, analysis of the
breast cancer mutations obtained from the literature shows that
their transition frequency is not significantly higher compared to
constitutional substitutions. Thus, although both prostate and
breast cancer mutations fit the THEMIS motif, only in prostate
cancer are there more transitions than expected by chance.
Moreover, the GA motif is overrepresented only among breast
cancer mutation sites, not prostate cancer. These data suggest that
similar yet distinct molecular etiologies exist between the
generation of somatic mutations in the prostate and breast. For
example, both prostate and breast cancer mutations may be
caused by EP repair, but the polymerase or the carcinogen
involved may be different.

The THEMIS motif includes two “spacer segments” that are of
variable length, from 0 to 2 nucleotides. The spacer segments are
often in the 1-1 permutation (i.e., a one-nucleotide gap on each side
of the mutation), but the data reported here are only statistically
significant when the spacer is variable (0-2). In addition, the
degenerate nature of the motif inevitably has the effect that the
same mutation site often fits more than one permutation (e.g., both
0-1 and 1-1). This fact makes it difficult to calculate exact
probabilities for each specific permutation. Variable spacers have
Table 4. Patients With Somatic Nucleotide Substitutions in
All "Target" Genes Analyzed (HSD3B2, SRD5A2, and HPRT
Genes)*

Patient  Gene(a)   DNA change      Sequence context(b)

1        HPRT      g.39835C>T(c)   gTG GGG TC CCT
                   g.39897C>T      AGA T GGt TA AAT
                   g.39985T>C      AGA TT tAA AAG
                   g.40051C>T(c)   TGt CT GGA ATT
         SRD5A2    g.888G>A        TGC C AGc CcG
                   g.1890G>A       AcA AGG TGG CTT
         HSD3B2    g.8774C>T(c)    TGG GT GGA GTT
         POLB      g.23912G>A      ATC A cAG GTG
         POLH      g.28891G>A      GTC TT GGA G GAA
6        HPRT      g.39835C>T(c)   gTG GGG TC CTT
                   g.40051C>T(c)   TGt CT GGA ATT
         SRD5A2    g.888G>A        TGC C AGc CcG
                   g.1294T>C(c)    AGA AGG CAG
         HSD3B2    g.1551A>G       AGC AGG AgG
                   g.1571G>A       TcA GAG GAT
                   g.1622T>C       TcC AAG GCC CTG
                   g.1671A>G       AGt AAA CTT
         POLH      g.29021A>T      TTT TT AAA ATC
11       HPRT      g.40055A>G      TTC C AGA C AAG
                   g.40073A>G      gGA T AtG CC CTT
                   g.40091A>G      ATG AAt A CTT
         SRD5A2    g.2019T>C       CGC AGc CC AAG
         HSD3B2    g.8089G>A       AGA AGG CTG
                   g.8174G>A       TGG GGA AG GAG
         POLB      g.15180G>A      cTA GAA G GTG
21       HPRT      g.39859G>A      AGC C AGA CTG
                   g.39939T>G      AGC AAt TAT AAG
                   g.40011T>C      ATC TA AAt GAT
         SRD5A2    g.888G>A        TGC C AGc CcG
                   g.1914G>A       cTG GAG CC AAT
                   g.1927C>T(c)    AcC GAG GA AAT
         HSD3B2    g.8006A>G       AGG AAA T CAT
                   g.8577T>C(c)    AGG T GAA CAc
         POLK      g.66992A>G      ATT T AAA CTT

*Refer to Table 1 for legend. Sequence contexts that conform to the THEMIS motif
are shaded. GenBank sequences: HPRT, M26434.1; SRD5A2, L03843.1; HSD3B2,
M77144.1; POLB, AF491812.1; POLH, AY388614.1; and POLK, AY273797.1.
(a) Polymerase mutations are in bold.


been reported before at motif sites in nature. For example, the
spacer between the -35 and -10 consensus sequences of the Sigma
A protein binding site varies between 16 and 18 nucleotides
[Helmann, 1995], while the spacer between the two boxes of the
Sigma B binding site is 13 to 15 nucleotides long [Petersohn et al.,
1999]. The biological mechanism that allows this spacer variability
is unknown, but we speculate that the wide and flexible active site
[Perlow-Poehnelt et al., 2004] of the EP polymerases may be
responsible: it may accommodate the invariable 9 nucleotides of the
THEMIS motif in various manners, e.g., with 0-1, 1-1, or 1-0
spacers. Y-family polymerases (such as pol η and pol κ) actually
have two partially overlapping active sites [Chandani and Loechler,
2007], a finding that may also contribute to the spacer variability:
the two active sites may have different binding preferences.
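
The observation that one mutation site can fit several spacer permutations is easy to see by enumeration. The sketch below is illustrative only: it checks a single strand, takes the start of the RRR core as given, and lists every (left spacer, right spacer) pair of 0 to 2 nucleotides for which the nine fixed positions match. The function name and the example sequence are hypothetical and were constructed for this illustration, not taken from the mutation data.

```python
from itertools import product

# IUPAC codes used by the motif: W = A/T, K = G/T, V = A/C/G, R = A/G.
IUPAC = {"W": "AT", "K": "GT", "V": "ACG", "R": "AG"}
FIXED_PATTERN = "WKVRRRVWK"  # WKV + RRR + VWK, spacers excluded


def fitting_spacers(seq, core_start, max_mismatch=0):
    """List (left_gap, right_gap) pairs for which the motif fits around the core."""
    seq = seq.upper()
    fits = []
    for left_gap, right_gap in product(range(3), repeat=2):
        start = core_start - left_gap - 3
        end = core_start + 3 + right_gap + 3
        if start < 0 or end > len(seq):
            continue  # this layout does not fit inside the sequence
        fixed = (seq[start:start + 3]
                 + seq[core_start:core_start + 3]
                 + seq[end - 3:end])
        mismatches = sum(b not in IUPAC[c] for b, c in zip(fixed, FIXED_PATTERN))
        if mismatches <= max_mismatch:
            fits.append((left_gap, right_gap))
    return fits


# Constructed example: the core AGA starts at index 4 of ATGA|AGA|CATT.
print(fitting_spacers("ATGAAGACATT", core_start=4))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```

In this constructed case the same core fits four permutations without any mismatch, which illustrates why assigning a probability to one specific permutation is not straightforward.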

EP  DNA  Polymerases:  A  Role  in  the  Origin  of  Somatic 
Mutations? 

The existence of a frequently mutated somatic motif suggests
that many mutations in the most common forms of human cancer
may have similar molecular etiology. The THEMIS motif is
overrepresented among somatic mutation sites induced by EP
repair in the SENCAR mouse skin model following mutagenic
treatment (Supplementary Table S5 and data not shown;
P = 0.01). These data parallel our findings in prostate/breast
cancer tissue. Thus, we propose that many somatic mutations in
many types of human cancer (such as breast, prostate, skin, and
lung) are caused by EP DNA repair following carcinogenic
exposure.

To test this model we decided to first screen the EP DNA
polymerase genes POLB (pol β), POLH (pol η), and POLK (pol
κ) for somatic mutations that may have in turn caused the high
number of motif mutations in our prostate cancer samples. We
report a total of 27 somatic mutations in the three EP polymerase
genes in our 26 samples (Table 3; only the pol β gene was fully
sequenced). Four somatic pol β mutations have been previously
reported in prostate cancer, but that study was conducted with
fewer samples (12 cases) of Asian background [Dobashi et al.,
1994]. Most (52%) of the somatic mutations reported here are
missense, while two change AG splice junctions and another
mutation changes a lariat-adenine to guanine (Table 3). The two
splice-junction mutations are predicted to result in deletions of
two to three amino acids, at the minimum. Most (59%) of the 27
somatic mutations are prevalent in the tumor, suggesting that they
may be “drivers.”

X-ray crystallography studies of human pol β may shed light
on the potential role of the mutated residues: proline-261 forms a
hydrogen bond with glutamine-264 [Pelletier et al., 1996], while
threonine-292 and methionine-236 are both involved in template
binding [Sawaya et al., 1997; Bose-Basu et al., 2004]. Disruption of
the hydrogen bond between residues 261 and 264 has been
proposed to result in the mutator phenotype displayed by the
previously reported prostate cancer-associated I260M pol β
mutation [Dalal et al., 2005]. Thus, the P261L mutation may also
disrupt this hydrogen bond and result in prostate tumorigenesis.
Moreover, both T292A and M236L substitutions are predicted to
destroy the hydrogen bond between the respective residues and
the DNA template, affecting DNA synthesis fidelity. Methionine-236
and proline-242 are in the “flexible loop,” a part of pol β that
functions to position the primer and that has been shown to
contain several residues that cause a mutator phenotype when
mutated [Dalal et al., 2004]. The P242R mutant resulted in similar
activity and a four-fold lower mutation rate than the normal pol β
when assayed in vitro by a herpes simplex virus thymidine kinase
forward mutation assay [Hamid and Eckert, 2005]. Thus, the
decreased prevalence of the P242R mutant in the tumor compared
to the adjacent normal tissue of Patient 13 (see Results) may
translate into higher mutagenicity in the tumor. Therefore, several
of the somatic pol β mutations that we report here may cause a
mutator phenotype [Loeb et al., 2003]. Future functional studies
ought to examine this prediction.

One of the missense pol η mutations is in glycine-263 and both
missense pol κ mutations are in threonine-205 (Table 3). The
homologous residue of pol κ threonine-205 is pol η threonine-122
[Boudsocq et al., 2002]. Missense mutations of both pol η
residues 122 and 263 were reported in XPV patients [Broughton
et al., 2002]. Thus, three of the somatic mutations that we
identified are in (or are in residues homologous to) pol η residues
previously associated with XPV. This finding suggests that these
mutations may play a role in carcinogenesis. Yet if some XPV
patients are born with a mutation of the same pol η codon that is
associated with prostate cancer etiology, then why do these
patients get only skin and not prostate cancer? A potential
explanation is different environmental exposure: XPV patients are
Table 5. Models for the Potential Etiology of the Observed Somatic Mutation Motifs*

DNA sequence motif/          Potential mutagen involved in         Potential polymerase involved in DNA damage repair
type of mutation             DNA damage                            and/or mutation generation

WKVnRRRnVWK
  RR on the strand that      PAH [Meehan et al., 1977]             pol β [Venugopal et al., 2005]/pol κ [Ogi et al., 2002](a)
  fits the motif             Cisplatin [Masutani et al., 2000]     pol η [Masutani et al., 2000]
                             AAF-dG                                pol η [Masutani et al., 2000]
                             8-oxo-dG                              pol η [Haracska et al., 2000]
  YY on the other strand     UV radiation [Akiyama et al., 1996]   pol η [Yu et al., 2001]
Transition                   Alkylating agent [Prakash and         EP polymerase (e.g., pol β [Sobol and Wilson, 2001])
                             Sherman, 1973]

*These models involve the action of a specific environmental mutagen acting in conjunction with an error-prone polymerase to cause specific kinds of mutations (or at specific
sites). Pol β is not a “classic” error-prone polymerase, yet it causes 67 times more substitution errors than mammalian pol δ [Kunkel, 2003]. The models presented in this table
are not mutually exclusive (e.g., a transition may also occur in the WKVnRRRnVWK motif). Bold letters indicate the potential sites of the altered nucleotide. For further
details see Discussion.
(a) The choice of the polymerase involved may depend on the type of the PAH adduct that is present (depurinating or stable).
PAH, polyaromatic hydrocarbons (e.g., benzo[a]pyrene); AAF-dG, acetylaminofluorene-deoxyguanosine; 8-oxo-dG, 8-oxo-deoxyguanosine; EP, error-prone; UV, ultraviolet.


inevitably  exposed  to  sunlight,  but  not  necessarily  to  prostate 
cancer-inducing  mutagens  (the  prostate  is  not  exposed  to  sunlight 
and pol η may be important for repairing other types of damage
relevant  to  the  prostate;  Table  5).  Alternatively,  the  tumor  type 
specificity  may  result  from  the  exact  nature  of  the  mutation. 

Multiple  factors  determine  the  mutagenic  potential  of  DNA 
damage;  among  them,  the  choice  of  the  DNA  repair  machinery 
evoked  to  repair  the  lesions  [Pages  and  Fuchs,  2002]  and  the 
nucleotide  sequence  context  in  which  a  lesion  occurs  [Beard  et  al., 
2002;  Wei  et  al.,  1995].  We  identified  a  motif  around  the  sites  of 
human  prostate  and  breast  cancer  mutations  and  mouse  skin 
cancer  mutations  induced  by  EP  DNA  repair.  A  model  for  the 
generation  of  these  mutations  (Table  5)  may  involve  the  action  of 
environmental  mutagens  acting  in  conjunction  with  (mutant)  EP 
polymerase  genotypes  to  cause  specific  kinds  of  mutations  and/or 
at  specific  sites.  A  mutant  DNA  polymerase  may  result  in 
mutations  directly,  through  decreased  fidelity,  or  indirectly, 
through  the  use  of  a  “nonoptimal”  polymerase  for  each  DNA 
lesion  (as  in  the  XPV  paradigm;  Pages  and  Fuchs,  2002).  The 
presence  of  a  motif  around  the  mutation  sites  can  result  from 
either  the  mutagenic  tendencies  of  specific  polymerases  or  the 
binding  requirements  of  specific  mutagens.  The  type  of  EP 
polymerase(s)  involved  in  the  etiology  of  these  mutations  may  be 
inferred  from  the  known  specificities  of  these  enzymes  (Table  5),  if 
the  type  of  carcinogenic  exposure  is  known  for  each  patient. 

Other  mechanisms  leading  to  the  generation  of  the  motif 
mutations  are  also  possible,  such  as  aberrant  mismatch  repair  or 
BER. In fact, one of the human polymerases that we propose to be
involved in this process, pol β (Table 5), is essential for
short-patch BER [Sancar et al., 2004]. Pol β-deficient mouse fibroblasts
do not show the high amounts of A:T to G:C transitions seen in
pol β-positive cells following benzo[a]pyrene treatment
[Venugopal et al., 2005], suggesting that pol β is responsible for most of
these transitions. Moreover, the fact that the THEMIS motif is
degenerate  may  indicate  the  presence  of  more  than  one 
mechanism  (e.g.,  more  than  one  polymerase  or  type  of  damage) 
with  distinct  sequence  requirements.  Alignment  of  more  than  one 
specific  sequence  motif  may  result  in  a  degenerate  motif. 
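
As an illustration of how a degenerate motif such as WKVnRRRnVWK can be located in a sequence, the following minimal Python sketch expands the IUPAC ambiguity codes (W = A/T, K = G/T, V = A/C/G, R = A/G, N = any base) into a regular expression and scans both strands. The function names and the example sequence are hypothetical and are not taken from the analysis reported here; the sketch shows only the matching step, not the statistical evaluation of the motif.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif):
    """Compile a degenerate motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="WKVnRRRnVWK"):
    """Return (strand, 0-based start on the given sequence) for every motif match."""
    pattern = motif_to_regex(motif)
    seq = seq.upper()
    width = len(motif)
    hits = [("+", m.start()) for m in pattern.finditer(seq)]
    # Matches on the opposite strand are reported in forward-strand coordinates.
    rc = reverse_complement(seq)
    hits += [("-", len(seq) - m.start() - width) for m in pattern.finditer(rc)]
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one forward-strand match starting at index 2.
    print(find_motif("CCATAAGGGACATCC"))
```

Scanning both strands matters here because the purine run (RRR) defines the strand that fits the motif, while the complementary strand carries the corresponding pyrimidines.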

In  summary,  we  report  here  a  DNA  sequence  motif  commonly 
found  in  prostate  cancer  mutation  sites,  termed  THEMIS 
(WKVnRRRnVWK).  We  extended  this  finding  to  human  breast 
and  mouse  skin  cancer  and  suggest  that  the  THEMIS  motif  may 
be  the  result  of  EP  DNA  polymerase-induced  mutations  in  these 
tumors.

ABSTRACT: DNA polymerase β is essential for short-patch
base excision repair. We have previously identified
20 somatic pol β mutations in prostate tumors, many of
them missense. In the current article we describe the
effect of all of these somatic missense pol β mutations
(p.K27N, p.E123K, p.E232K, p.P242R, p.E216K,
p.M236L, and the triple mutant p.P261L/T292A/I298T)
on the biochemical properties of the polymerase in vitro,
following bacterial expression and purification of the
respective enzymatic variants. We report that all missense
somatic pol β mutations significantly affect enzyme
function. Two of the pol β variants reduce catalytic
efficiency, while the remaining five missense mutations
alter the fidelity of DNA synthesis. Thus, we conclude
that a significant proportion (9 out of 26; 35%) of
prostate cancer patients have functionally important
somatic mutations of pol β. Many of these missense
mutations are clonal in the tumors, and/or are associated
with loss of heterozygosity and microsatellite instability.
These results suggest that interfering with normal
polymerase β function may be a frequent mechanism of
prostate tumor progression. Furthermore, the availability
of detailed structural information for pol β allows
understanding of the potential mechanistic effects of
these mutants on polymerase function.

Hum  Mutat  32:415-423,  2011.  ©  2011  Wiley-Liss,  Inc. 

KEY  WORDS:  somatic;  expression  analysis;  kinetic; 
POLB;  mutation 


Introduction 

Human DNA polymerase β (pol β) is a monomeric protein of 335
residues, which is essential for short-patch base excision repair
[Goodman, 2002]. Base excision repair (BER) is one of the major
pathways of DNA repair that removes oxidized and alkylated bases
from DNA [Friedberg, 2003]. Pol β is also involved in meiotic
recombination [Kidane et al., 2010] and repair of double-stranded
DNA breaks through the process of nonhomologous end joining
[Wilson and Lieber, 1999]. Targeted disruption of pol β in mice
resulted in neonatal lethality (due to respiratory failure), growth
retardation, and apoptotic cell death in the developing nervous system
[Sugo et al., 2000], suggesting a role for pol β in neurogenesis.

The small size and monomeric nature of DNA pol β make it an
attractive candidate for biochemical and kinetic analysis. Moreover,
the availability of a high resolution crystal structure of pol β
makes it easier to identify potential functionally important
residues for mechanistic studies. In addition, pol β shares many
structural and mechanistic features with other DNA polymerases
of known structure: for example, the mechanism of DNA
polymerization follows an ordered binding of substrates, with
the DNA template binding first [Beard and Wilson, 1998]. These
attributes make pol β a good candidate for biochemical analysis.

Pol β expression is increased in many cancer cells [Scanlon
et al., 1989] and overexpression of pol β results in aneuploidy and
tumorigenesis in nude immunodeficient mice [Bergoglio et al.,
2002]. Moreover, pol β heterozygous (+/−) mice exhibit increased
single-stranded DNA breaks, chromosomal aberrations, and
mutagenicity compared to normal animals [Cabelof et al.,
2003]. Thus, both higher and lower pol β activity can result in
increased mutagenesis in vivo. Furthermore, several missense
substitutions of pol β have also been shown to act as mutator
mutants both in vivo and in vitro [Maitra et al., 2002; Shah et al.,
2001].

Human DNA pol β is ubiquitously expressed, and the POLB
gene (MIM# 174760) is located at chromosome band 8p11, a
chromosomal region known to be lost during prostate cancer
progression [Visakorpi et al., 1995]. Somatic pol β mutations were
initially found in 2 out of 12 Japanese prostate cancer patients
[Dobashi et al., 1994], but that analysis was done by single-strand
conformation polymorphism, a technique that can miss mutations
[Pearce et al., 2008].

We  recently  sequenced  the  complete  coding  region  of  the  POLB 
gene  for  somatic  mutations  in  26  prostate  cancer  tissues.  We 
identified  20  somatic  mutations  in  these  prostate  tumors,  nine  of 
them  missense  [Makridakis  et  al.,  2009].  With  the  exception  of 
g.31911C>G  (p.P242R),  which  substitutes  the  normal  proline 
residue  at  position  242  with  arginine,  these  substitutions  were 
absent  in  lymphocyte  DNA  from  the  same  patient.  Many  of  the 
somatic  mutations  identified  were  prevalent  in  the  tumors  (i.e., 
they  were  present  in  more  than  half  of  the  tumor  chromosomes) 
[Makridakis  et  al.,  2009],  suggesting  that  they  play  an  important 
role in tumor progression. Overall, 61% of the prostate cancer
patients had somatic substitutions in pol β [Makridakis et al.,
2009].

Molecular epidemiologic studies have shown that the p.P242R
pol β substitution is significantly associated with decreased risk
for colorectal cancer [Moreno et al., 2006]. However, the same
substitution has also been associated with poorer lung cancer
prognosis [Matakidou et al., 2007]. Functional biochemical
studies may explain this discrepancy. In vitro, the p.P242R pol β
substitution behaves as an antimutator [Hamid and Eckert, 2005],
consistent with the colorectal cancer data.

Structural analyses may provide useful hints about the potential
importance of some of the somatic prostate cancer mutations that
we identified: nuclear magnetic resonance studies of [methyl-13C]
methionine-labeled pol β have shown that methionine-236 of pol β
interacts with a single-nucleotide gapped DNA template (the natural
pol β substrate), and that this interaction is essential for pol β
conformational activation [Bose-Basu et al., 2004]. Both
methionine-236 and threonine-292 are involved in DNA template binding
[Bose-Basu et al., 2004; Sawaya et al., 1997] and thus may be
important for function. Both the g.25314A>T (p.M236L) and the
g.32573A>G (p.T292A) substitutions (found in prostate cancer
patients) [Makridakis et al., 2009] are predicted to destroy the
hydrogen bond between these pol β residues and the DNA template
[Sawaya et al., 1997]. Thus, both of these mutations may result in
loss of activity and/or altered fidelity of DNA synthesis. Finally,
proline-242 is in the "flexible loop," a part of pol β that functions to
position the primer and has been shown to contain several residues
that cause a mutator phenotype when mutated [Dalal et al., 2004].

Even though we identified many somatic pol β mutations in
prostate cancer, we do not know the functional effect of these
mutations. In order to directly assess the potential role of the
identified pol β mutations in prostate cancer progression, we
biochemically characterized the effect of all the missense prostate
cancer variants on both polymerase activity and fidelity of DNA
replication, using mostly steady-state kinetic analysis. The data
presented here demonstrate that these somatic pol β mutations
may be important contributors to prostate cancer progression.

Furthermore, systematic biochemical characterization of the
prostate cancer associated missense mutations of pol β, a
monomeric polymerase highly suitable for mechanistic studies
[Beard et al., 2002], provides valuable information on the
molecular determinants of both polymerase activity and fidelity.
In addition, the availability of detailed structural information for
pol β allows structural-functional comparisons for all functionally
important residues.

Materials  and  Methods 

Nomenclature 

The  GenBank  reference  sequence  accession  number  used  for  the 
genomic  POLB  sequence  is  AF491812.1. 

Bacterial  Strains  and  Growth  Conditions 

The Escherichia coli strain BL21 (DE3) was used for pol β protein
expression. E. coli DH5α, BL21 (DE3), and recombinant E. coli
harboring human pol β variants were cultured in LB medium
containing 25 µg/ml kanamycin (Sigma-Aldrich, St. Louis, MO)
when appropriate.

Construction of Variants of pol β

The distinct pol β mutants were obtained by site-directed
mutagenesis using the Quick-change kit (Stratagene, La Jolla, CA)
according to the protocol of the manufacturer, using the pET28a(+)-WT
(wild-type) bacterial expression vector as a template (a gift of Dr.
Joann Sweasy from Yale University). Successful mutagenesis was
confirmed by DNA sequencing with BigDye chemistry on a 3100 ABI
sequencer (Perkin-Elmer, Waltham, MA).

Expression  and  Purification  of  Mutant  Enzymes 

Purification of pol β proteins was performed as previously
described [An et al., 2004; Kosa and Sweasy, 1999] with the
following modifications. Each pol β variant was expressed as a
fusion protein with a six-residue poly-histidine tag at the N
terminus. The enzymes were purified using the HisTrap FF crude Kit
(GE Healthcare, Piscataway, NJ) according to the manufacturer's
instructions. The fusion proteins were expressed in BL21 (DE3) cells,
which were grown at 37°C to mid-log phase and then induced for 3 to
6 hr with 1 mM IPTG. Cells were harvested by centrifugation,
resuspended in 40 mM Tris pH 8, 500 mM NaCl, 10 mM imidazole,
and 100 µl Protease Inhibitor Cocktail (Sigma-Aldrich), and lysed by
sonication. Extracts were cleared by centrifugation (15,000 rpm,
15 min at 4°C) and then loaded onto the HisTrap FF crude Kit column
(one per 100 ml of starting culture). The proteins were eluted with
500 mM imidazole in 0.5 M NaCl. Eluted proteins were then loaded onto
a HiTrap SP HP column (GE Healthcare). The column was washed with
100 mM NaCl, and proteins were eluted with 2 M NaCl and stored at −80°C
in 50 mM Tris pH 8, 1 mM EDTA, 2 M NaCl, 10% glycerol, and
protease inhibitors as above. Purified proteins were run on a
Coomassie Blue-stained SDS-PAGE gel to assess purity. Protein
levels were quantified by Bradford protein assay (Sigma-Aldrich).

Western Blotting

Pol β proteins were identified by Western blot [Servant et al.,
2002]. Proteins were electrophoresed in a 12% SDS-PAGE gel and
transferred to a polyvinylidene difluoride membrane (Thermo
Scientific, Waltham, MA). Blots were blocked with 5% nonfat dry milk in
Tris-buffered saline-Tween 20 (0.1% Tween) and incubated with
anti-His tag antibody (Sigma-Aldrich) according to the protocol
of the manufacturer. For detection we used IR Dye 800CW
Goat Anti-Rabbit IgG (LI-COR Biosciences, Lincoln, NE) and the
Odyssey apparatus (LI-COR Biosciences).

Assay  of  DNA  Polymerase  Activity 

The DNA polymerase activity assay was performed by incorporation
of [α-32P]dATP (Perkin-Elmer) as previously described [Maitra
et al., 2002] with the following modifications. The final reaction
mixture was 50 mM Tris buffer pH 8.0, 20 mM MgCl2, 100 mM
NaCl, 200 µg/ml bovine serum albumin (BSA), 200 µM
dithiothreitol, 20 µM dATP, 100 µM each of the three remaining dNTPs,
2 µCi of [α-32P]dATP, and 10 µg activated calf thymus DNA.
Reactions were incubated at 37°C for 30 min and stopped with
EDTA. The reaction mixtures were spotted on GFA filters
(Whatman, Piscataway, NJ), which were washed twice with
22.5 mg/ml NaPPi, 8.5% concentrated perchloric acid and twice
with 22.5 mg/ml NaPPi, 8% concentrated hydrochloric acid, and
then washed in ethanol. The filters were dried and radioactivity
was counted on a scintillation counter.

DNA  Substrate 

All oligonucleotides were synthesized and high-pressure liquid
chromatography-purified by Integrated DNA Technologies
(Coralville, IA). A 20-mer (5'-GCA GGA AAG CGA GGG TAT
CC-3'), a 46-mer (5'-TAT GGT ACG CTG GAC TTT GTG GGA
TAC CCT CGC TTT CCT GCT CCT G-3'), and a 20-mer (5'-ACA
AAG TCC AGC GTA CCA TA-3') oligonucleotide were used as
primer, template, and downstream oligonucleotide, respectively
[Chagovetz et al., 1997]. 5'-32P end labeling of the 20-mer primer
was performed with 3,000 Ci/mmol [γ-32P]ATP (Perkin-Elmer)
using T4 polynucleotide kinase (U.S. Biochemical Corp.,
Cleveland, OH) according to the manufacturer's protocol. The
32P-labeled primer was then purified from unincorporated label
on a Microspin G-50 (GE Healthcare) column. The downstream
oligonucleotide was 5'-end phosphorylated by Integrated
DNA Technologies. The oligonucleotides were annealed at a
primer:template:downstream oligonucleotide molar ratio of
1:1.2:1.3 in 50 mM Tris, pH 8.0, 250 mM NaCl, in order to create a
single nucleotide gap. The mixture was incubated sequentially at
95°C for 5 min, slow cooled to 50°C for 30 min, and 50°C for
20 min and then immediately transferred to ice. Annealing of
primer was confirmed on an 18% polyacrylamide
(acrylamide/bis-acrylamide: 29:1) native gel followed by autoradiography as
described [Li et al., 1999; Maitra et al., 2002].

Steady-State  Incorporation  Experiments 

All incorporation reactions (20 µl) were performed in 50 mM
Tris-Cl (pH 8.0), 10 mM MgCl2, 2 mM DTT, 20 mM NaCl, 0.2 mg/ml
BSA, 2.5% glycerol and contained 50 nM annealed DNA substrate
(see above). Correct incorporation (activity) reactions contained
2.5 nM purified pol β enzymes except for the triple mutant
(600 nM), while misincorporation (fidelity) reactions contained
2.5 nM to 300 nM purified pol β enzymes. All concentrations
given refer to the final concentrations after mixing. All reactions
were performed by first preincubating the DNA substrate with pol
β for 3 min at 37°C without the dNTPs. Reactions were initiated
by the addition of a single dNTP (0.1-1400 µM). After 2 min
incubation at 37°C, the reactions were quenched by adding 20 µl
of formamide loading buffer (formed by mixing 900 µl formamide,
22.2 µl 0.5 M EDTA [pH 8.0], and 77.8 µl water), then
boiled for 10 min and immediately transferred onto ice. Products
were resolved on a 15% polyacrylamide
(acrylamide/bis-acrylamide: 29:1) gel containing 7 M urea and then quantified with a
Typhoon Trio+ Variable Mode PhosphorImager (GE Healthcare).
To ensure that all reactions were conducted at steady state, the
enzyme concentrations were optimized using time course
experiments [Chagovetz et al., 1997].

Kinetic  Analysis 

Kinetic data analyses were based on Lineweaver-Burk plots. We
determined the values of kcat and Km,dNTP from trendline
equations calculated from these plots with Microsoft Excel
(Microsoft, Redmond, WA), using values obtained from
the plotted kinetic data. The kcat/Km,dNTP values of mispairs were
determined by dividing the kcat by the Km values obtained by Excel
analysis of the inverse plots. Fidelity values for each dNTP
were calculated using the following equation:

$$\text{Fidelity} = \frac{(k_{cat}/K_{m,dNTP})_{correct} + (k_{cat}/K_{m,dNTP})_{incorrect}}{(k_{cat}/K_{m,dNTP})_{incorrect}}$$

[Li et al., 1999]. Representative plots are shown in Supp. Figure S1.
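
The analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Excel workflow: it fits the double-reciprocal (Lineweaver-Burk) line 1/v = (Km/Vmax)(1/[S]) + 1/Vmax by ordinary least squares, converts Vmax to kcat by dividing by the enzyme concentration, and applies the fidelity equation. The function names and the synthetic rate data are assumptions made for the example.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v against 1/[S] by least squares and return (Vmax, Km).

    The fitted line has slope Km/Vmax and intercept 1/Vmax.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


def kcat_from_vmax(vmax, enzyme_conc):
    """kcat = Vmax / [E]; both concentrations must share the same unit."""
    return vmax / enzyme_conc


def fidelity(eff_correct, eff_incorrect):
    """Fidelity = (eff_correct + eff_incorrect) / eff_incorrect, with eff = kcat/Km."""
    return (eff_correct + eff_incorrect) / eff_incorrect


if __name__ == "__main__":
    # Synthetic Michaelis-Menten data (hypothetical): Vmax = 2.0e-4 µM/s, Km = 0.25 µM.
    dntp = [0.1, 0.2, 0.5, 1.0, 2.0]                     # µM
    rates = [2.0e-4 * s / (0.25 + s) for s in dntp]      # µM/s
    vmax, km = lineweaver_burk(dntp, rates)
    kcat = kcat_from_vmax(vmax, 0.0025)                  # 2.5 nM enzyme, in µM
    print(vmax, km, kcat)
```

With noise-free input the fit recovers Vmax and Km exactly; with real data the double-reciprocal transform weights the low-concentration points heavily, so a direct nonlinear fit of the Michaelis-Menten equation is a common cross-check.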

Assay of DNA pol β Lyase Activity

The DNA pol β lyase activity assay was performed as in Prasad
et al. [1998], with the following modifications. Briefly, the
HPLC-purified DNA oligonucleotide (Integrated DNA Technologies)
49_lyase was 3' end labeled with [α-32P]dATP (Perkin-Elmer) by
Terminal Transferase (New England Biolabs Inc., Ipswich, MA).
The sequence of 49_lyase is
5'ACTACAAATTAGAAAATAGCUGTCCTTGACGGCTAGAATTACCTACCGG3', which contains a
uracil at position 21 (underlined). The 50-µl tailing reaction
included 50 mM potassium acetate, 20 mM Tris-acetate (pH 7.9),
10 mM magnesium acetate, 0.25 mM CoCl2, 10 µCi of
[α-32P]dATP and 5 pmol of substrate DNA. The reaction was
incubated at 37°C for 30 min. After incubation, the labeled
oligonucleotide was purified on Micro Bio-Spin chromatography
columns (Bio-Rad, Hercules, CA). The complementary strand of
49_lyase is RC49_lyase, whose sequence is
5'CCGGTAGGTAATTCTAGCCGTCAAGGACAGCTATTTTCTAATTTGTAGT3', and which
has a dA at the position opposite the uracil. Following
annealing of the forward and reverse strands, the AP site was
created by removal of the uracil with Uracil DNA Glycosylase
(UDG). The UDG reaction included 20 mM Tris-HCl pH 8.0,
1 mM EDTA, 1 mM DTT, 18 nM cold annealed substrate,
3 × 10⁴ cpm [α-32P]dATP-labeled annealed substrate, and
10 units of UDG (New England Biolabs Inc.) in a 300-µl final
volume. The UDG reaction was performed at 37°C for 30 min and
then stopped by phenol/chloroform extraction. Purified pol β
variants were tested for their lyase activity. The reaction included
50 mM Hepes pH 7.4, 2 mM DTT, 5 mM MgCl2, with or without
0.2 units of apurinic endonuclease 1 (APE I) (New England
Biolabs Inc.). The final volume was 10 µl. The reaction was incubated
at 37°C for 30 min and stopped by addition of 10 µl of 2×
formamide loading buffer, which contains 20 mM EDTA and 95%
formamide. Each assay was done in triplicate. After denaturing at
75°C for 2 min, 10 µl of sample was loaded onto a 15%
polyacrylamide gel containing 7 M urea. Electrophoresis was done at 75 W
for 80 min in 1× TBE buffer in a Sequi-Gen GT Electrophoresis
Cell (Bio-Rad, Hercules, CA). Gel drying and autoradiography
were done as usual. Bands were captured and quantified with
ImageQuant 5.1 software. The lyase activity was calculated following
kinetic analysis, as described above.
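
As a quick consistency check on the substrate design described above, the following Python sketch confirms that RC49_lyase is the exact reverse complement of 49_lyase, that the single uracil sits at position 21, and that it pairs with an adenine. The two sequences are copied from the text; the helper names are hypothetical.

```python
# Watson-Crick partners; uracil pairs with adenine like thymine does.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

# Sequences as given in the text, written 5' to 3'.
LYASE_49 = "ACTACAAATTAGAAAATAGCUGTCCTTGACGGCTAGAATTACCTACCGG"
RC_LYASE_49 = "CCGGTAGGTAATTCTAGCCGTCAAGGACAGCTATTTTCTAATTTGTAGT"


def reverse_complement(seq):
    """Return the DNA reverse complement, reading a uracil as pairing with A."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def partner_base(top, bottom, position):
    """Base on the bottom strand paired with the 1-based position on the top strand."""
    return bottom[len(top) - position]


if __name__ == "__main__":
    uracil_position = LYASE_49.index("U") + 1  # convert to 1-based numbering
    print(len(LYASE_49), uracil_position)
    print(reverse_complement(LYASE_49) == RC_LYASE_49)
    print(partner_base(LYASE_49, RC_LYASE_49, uracil_position))
```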


Results 

Expression Constructs of Variants of DNA pol β

Somatic variants of DNA pol β originally found in prostate
cancer tissue [Makridakis et al., 2009] were obtained by
site-directed mutagenesis of the human pol β cDNA, as described in
Materials and Methods. Both normal (WT) and variant forms of pol
β were expressed in E. coli and purified using column filtration
techniques, as described in Materials and Methods. After
purification, WT and variants of DNA pol β were analyzed by
SDS-PAGE. This analysis showed that we obtained >90% pure
versions of pol β for all variants except the triple mutant (for
which we obtained 51% homogeneity; data not shown). Figure 1
displays WT pol β protein analyzed on a Coomassie blue-stained
SDS-polyacrylamide gel and detected by Western blot.


Variants of DNA pol β Have Polymerase Activity

We tested the polymerase activity of WT and variants of human
DNA pol β by incorporating dNTPs into activated calf thymus
DNA. These experiments showed that all variants of DNA pol β
were active (data not shown).

Figure 1. Identification of purified pol β protein and gel assay of
pol β single nucleotide insertion. A: Electrophoretic analysis of the
expressed pol β protein at various stages of purification. Separation
was performed on a 12.5% (w/v) SDS-polyacrylamide gel. M, marker;
lane 1, crude extract from BL21 (DE3) E. coli cells containing
pET-28a(+)/pol β; lane 2, crude extract from IPTG-induced BL21 (DE3)
E. coli cells containing pET-28a(+)/pol β; lane 3, purified pol β.
B: Detection of purified protein by Western blot. M, marker; lane 1,
purified wild-type pol β. C: A gapped oligo substrate was incubated
with wild-type pol β in increasing concentrations of a single dNTP as
described under "Experimental Procedures." The pol β concentration was
adjusted to optimize detection of primer extension products. dATP
misincorporation resulted in a single nucleotide extension of the
misinserted dATP (correct extension against the next nucleotide [dT]
of the DNA substrate) but only at higher dATP concentrations.

Effect of DNA pol β Variants on Catalytic Efficiency
(Correct Incorporation)

To perform pol β kinetic analysis we employed a DNA substrate
containing a single-nucleotide gap that was previously used for
fidelity studies [Chagovetz et al., 1997; Roberts and Kunkel, 1996].
We determined pol β catalytic efficiency based on steady-state
kinetic analyses of single-nucleotide addition (dCTP) opposite
template dG on our single gapped DNA substrate. The catalytic
activities (kcat) of both the WT and genetic variants of DNA pol β
are listed in Table 1. This table shows that the catalytic activity of
the g.15180G>A (p.E123K) variant was increased 63% compared
to WT. Variants g.25302G>A (p.E232K), p.M236L, p.P242R,
g.25254G>A (p.E216K), and g.1444G>C (p.K27N) exhibit similar
catalytic activity to that of WT pol β.

In comparing Km,dCTP between the specific variants and WT
pol β (Table 1), we found that variants p.K27N and p.M236L
increased the Km,dCTP by 72% and 56%, respectively. Thus, these
variants of pol β have decreased affinity for the substrate. Other
variants of pol β except the triple mutant (g.32481C>T (p.P261L)/
p.T292A/g.32592T>C (p.I298T); Makridakis et al. [2009]) showed
similar Km,dCTP to that of WT pol β.

The best measure for assessing the potential effect of nucleotide
variations on enzymatic activity in vivo is catalytic
efficiency. The catalytic efficiency (kcat/Km,dCTP) of the p.K27N
variant was decreased 42% compared to WT pol β in our assays.


Table 1. Steady-State Kinetic Parameters for Correct
Incorporation into a Gapped Oligo Substrate by Wild-Type pol β
and Variantsᵃ

| Enzyme | kcat (s⁻¹)ᵇ | Km,dCTP (µM) | kcat/Km,dCTP (s⁻¹ M⁻¹) |
|---|---|---|---|
| WT | 8.09 (±2.06) × 10⁻² | 0.25 (±0.08) | 3.24 (±0.92) × 10⁵ |
| p.K27N | 8.06 (±1.10) × 10⁻² | 0.43 (±0.05) | 1.87 (±0.13) × 10⁵ |
| p.E123K | 1.32 (±0.11) × 10⁻¹ | 0.31 (±0.01) | 4.24 (±0.20) × 10⁵ |
| p.E232K | 1.09 (±0.18) × 10⁻¹ | 0.21 (±0.02) | 5.20 (±1.35) × 10⁵ |
| p.P242R | 1.03 (±0.21) × 10⁻¹ | 0.27 (±0.02) | 3.81 (±0.98) × 10⁵ |
| p.E216K | 8.62 (±1.63) × 10⁻² | 0.24 (±0.09) | 3.59 (±1.59) × 10⁵ |
| p.M236L | 1.06 (±0.24) × 10⁻¹ | 0.39 (±0.01) | 2.72 (±0.57) × 10⁵ |
| Triple mutant | 4.03 (±0.27) × 10⁻⁴ | 3.72 (±0.17) | 1.08 (±0.08) × 10² |

ᵃThe results represent the mean of at least three independent determinations ±
standard error. Triple mutant = p.P261L/p.T292A/p.I298T.

ᵇCalculated using total protein concentration.

WT, wild-type.


Figure 2. Influence of pol β variants on catalytic efficiency for
dCTP insertion (bars, left to right: WT, K27N, E123K, E232K, P242R,
E216K, M236L, Triple). WT and mutant variants were assayed on a
single-nucleotide gapped DNA substrate with a templating dG, and the
catalytic efficiency (kcat/Km,dCTP) was determined by dividing the kcat
by the Km values obtained by Excel analysis of the inverse plots. These
data represent the mean of at least three independent determinations
± standard error.

The remaining pol β variants, again with the exception of the
triple mutant, displayed similar catalytic efficiency to that of WT
pol β (Table 1 and Fig. 2).

Interestingly, the triple mutant pol β variant was significantly
different from WT, and even from the other variants (Table 1). The triple
mutant variant showed a 99.5% decrease in catalytic activity
compared with WT (Table 1). The Km,dCTP of the triple mutant
was 15-fold higher than that of WT pol β, and the catalytic efficiency
(kcat/Km,dCTP) of the triple mutant was thus dramatically decreased,
to roughly three orders of magnitude below WT (Table 1 and Fig. 2).
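
The percentages quoted in this section follow directly from the entries of Table 1. The short Python sketch below recomputes the catalytic efficiency (kcat/Km, with Km converted from µM to M) and the changes relative to WT. The kcat and Km values are the table means; the dictionary layout and function names are assumptions for the example, and the standard errors are ignored.

```python
# Mean kcat (s^-1) and Km,dCTP (µM) as listed in Table 1.
TABLE1 = {
    "WT": (8.09e-2, 0.25),
    "p.K27N": (8.06e-2, 0.43),
    "p.E123K": (1.32e-1, 0.31),
    "p.E232K": (1.09e-1, 0.21),
    "p.P242R": (1.03e-1, 0.27),
    "p.E216K": (8.62e-2, 0.24),
    "p.M236L": (1.06e-1, 0.39),
    "Triple mutant": (4.03e-4, 3.72),
}


def efficiency(kcat, km_micromolar):
    """Catalytic efficiency kcat/Km in s^-1 M^-1 (Km supplied in µM)."""
    return kcat / (km_micromolar * 1e-6)


def percent_change(value, reference):
    """Signed percentage change of value relative to reference."""
    return 100.0 * (value - reference) / reference


def summarize(table=TABLE1, reference="WT"):
    """Return per-variant efficiency and changes relative to the reference enzyme."""
    ref_kcat, ref_km = table[reference]
    ref_eff = efficiency(ref_kcat, ref_km)
    rows = {}
    for name, (kcat, km) in table.items():
        eff = efficiency(kcat, km)
        rows[name] = {
            "efficiency": eff,
            "kcat_change": percent_change(kcat, ref_kcat),
            "km_change": percent_change(km, ref_km),
            "km_fold": km / ref_km,
            "efficiency_change": percent_change(eff, ref_eff),
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize().items():
        print(name, f"{row['efficiency']:.3g}", f"{row['efficiency_change']:+.1f}%")
```

Because the inputs are rounded table means, recomputed efficiencies can differ from the tabulated ones in the last digit.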

Effect of DNA pol β Variants on Fidelity (Misincorporation)

Misincorporation studies were performed in order to understand
the role of pol β variants in DNA synthesis fidelity. These
experiments were performed with enzyme concentrations that
were 120-fold higher than in the correct incorporation experiments for
both variants and WT. Figure 1C shows the results obtained for
both correct and incorrect incorporation (misincorporation)
experiments for the WT, with all four dNTPs.

Due to the low inherent activity of the triple pol β mutant we
could not measure the catalytic activity for misincorporation even
though we used more triple mutant enzyme and dNTPs (data not
shown). The kcat, Km,dNTP, and catalytic efficiency (kcat/Km,dNTP)
values for misincorporation are reported in Table 2 for the remaining variants.
Percentage changes of the kcat, Km,dNTP, and catalytic efficiency
(kcat/Km,dNTP) were obtained from Table 2 and are listed in Table 3.
Variants p.K27N and p.E123K showed similar kcat to that of WT
enzyme for dTTP, dATP, and dGTP misincorporations (Table 3).
Variants p.P242R and p.E216K displayed decreased kcat for all
dNTP misincorporations (Table 3). The p.E232K variant had
similar kcat to WT enzyme for dATP and decreased kcat for dTTP
and dATP misincorporations (Table 3). The p.M236L variant
showed similar kcat for dTTP and dATP to the WT enzyme, and
increased kcat for dGTP misincorporation compared to WT
(Table 3).

The Km for dTTP misincorporation was increased for variants
p.K27N, p.E216K, p.E232K, and p.P242R (Table 3). The Km of
both p.E232K and p.P242R variants was increased for both dTTP
and dATP misincorporations, but the Km for dGTP misincorporation
was decreased only for the p.E232K variant, whereas the
p.P242R variant showed Km similar to WT (Table 3). The p.E123K
and p.M236L variants showed WT Km for dTTP and dGTP
misincorporations and decreased Km for dATP misincorporations
(Table 3).

When catalytic efficiencies (kcat/Km,dNTP) for misincorporation
were compared, variants p.K27N, p.P242R, and p.E216K showed
decreased efficiency for all misincorporations (Table 3). The
p.E232K variant had decreased catalytic efficiency for both dTTP
and dATP misincorporations but showed WT efficiency for dGTP
misincorporation (Table 3). The kcat/Km,dNTP of the p.E123K and
p.M236L variants was increased for dTTP and dATP
misincorporations but was similar to the WT enzyme for dGTP
misincorporation (Table 3).
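
The percentage changes summarized in Table 3 can be reproduced from the kinetic parameters in Table 2. The following is a minimal sketch for a single case (p.K27N versus WT, dG-dTTP mispair); the helper names are hypothetical, and the 10^-3 scale of kcat for these two rows is an assumption inferred from the legible rows of Table 2 and from the listed kcat/Km values.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """kcat/Km in s^-1 M^-1, with Km supplied in micromolar."""
    return kcat_per_s / (km_micromolar * 1e-6)

def percent_change(variant_value, wt_value):
    """Signed percentage change of a variant parameter relative to WT."""
    return 100.0 * (variant_value - wt_value) / wt_value

# Table 2, dG-dTTP misincorporation (kcat in s^-1, Km in micromolar).
# The 10^-3 exponent on kcat is assumed (see the lead-in above).
wt = {"kcat": 1.45e-3, "km": 52.20}
k27n = {"kcat": 1.40e-3, "km": 83.78}

wt_eff = catalytic_efficiency(wt["kcat"], wt["km"])
k27n_eff = catalytic_efficiency(k27n["kcat"], k27n["km"])

km_change = percent_change(k27n["km"], wt["km"])
eff_change = percent_change(k27n_eff, wt_eff)
print(f"WT kcat/Km = {wt_eff:.2f} s^-1 M^-1")
print(f"p.K27N: Km {km_change:+.0f}%, kcat/Km {eff_change:+.0f}%")
```

Under these assumptions the sketch gives a Km about 60% higher and an efficiency about 40% lower for p.K27N, in line with the corresponding Table 3 entries.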


Table 2. Steady-State Kinetic Parameters for Incorrect
Incorporation into a Gapped Oligo Substrate by Wild-Type pol β
and Variants^a

Enzyme     kcat (s^-1)^b           Km,dNTP (µM)        kcat/Km,dNTP (s^-1 M^-1)

dG-dTTP
WT         1.45 (±…) × 10^-3       52.20 (±1.65)       27.78 (±0.50)
p.K27N     1.40 (±…) × 10^-3       83.78 (±3.23)       16.71 (±0.47)
p.E123K    1.47 (±…) × 10^-3       47.60 (±3.73)       30.88 (±0.99)
p.E232K    1.39 (±…) × 10^-3       59.69 (±1.31)       23.29 (±0.40)
p.P242R    1.39 (±…) × 10^-3       60.98 (±1.72)       22.79 (±0.43)
p.E216K    1.32 (±0.03) × 10^-3    85.21 (±14.33)      15.49 (±2.18)
p.M236L    1.38 (±0.08) × 10^-3    45.02 (±5.79)       30.65 (±2.30)

dG-dATP
WT         1.31 (±…) × 10^-3       46.78 (±1.90)       28 (±1.56)
p.K27N     1.28 (±0.04) × 10^-3    74.59 (±5.55)       17.16 (±0.81)
p.E123K    1.31 (±0.02) × 10^-3    38.82 (±4.65)       33.75 (±3.86)
p.E232K    1.26 (±0.05) × 10^-3    59.16 (±6.30)       21.3 (±1.43)
p.P242R    1.22 (±0.04) × 10^-3    57.13 (±2.47)       21.35 (±1.51)
p.E216K    1.10 (±…) × 10^-3       72.44 (±14.30)      15.18 (±1.88)
p.M236L    1.32 (±0.04) × 10^-3    39.67 (±2.44)       33.27 (±1.08)

dG-dGTP
WT         9.78 (±0.20) × 10^-4    446.41 (±15.77)     2.19 (±0.08)
p.K27N     8.82 (±1.84) × 10^-4    767.89 (±256.68)    1.15 (±0.13)
p.E123K    9.99 (±0.60) × 10^-4    457.59 (±47.18)     2.18 (±0.11)
p.E232K    7.35 (±0.95) × 10^-4    322.91 (±104.29)    2.28 (±0.42)
p.P242R    7.98 (±0.15) × 10^-4    430.51 (±23)        1.85 (±0.09)
p.E216K    8.29 (±0.54) × 10^-4    748.76 (±88.86)     1.11 (±0.07)
p.M236L    1.16 (±…) × 10^-3       478.25 (±48.46)     2.43 (±0.21)

aThe results represent the mean of at least three independent
determinations ± standard error.
bCalculated using total protein concentration.
WT, wild-type.


Relative  Fidelity  of  Variants 

We defined fidelity as the ratio of the summed catalytic
efficiencies of correct and incorrect nucleotide incorporation to
the catalytic efficiency of misincorporation (Table 4). Relative
fidelities for all variants were then obtained by dividing the
fidelity of each variant by the WT fidelity, using the data from
Table 4 (Fig. 3).
The  fidelity  of  p.K27N  was  similar  to  that  of  WT  enzyme  for  all 
misincorporations.  The  fidelity  of  p.E123K  was  similar  to  that  of 
WT  enzyme  for  dATP  misincorporation,  but  increased  for  dTTP 
and  dGTP  misincorporation  (Table  4  and  Fig.  3).  The  DNA 
synthesis  fidelity  of  variants  p.E232K,  p.P242R,  and  p.E216K  was 
increased  for  all  misincorporations  compared  to  WT  (Table  4  and 
Fig.  3).  In  contrast,  the  p.M236L  variant  showed  decreased  fidelity 
for  all  misincorporations  (Table  4  and  Fig.  3). 
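
As a rough illustration of this calculation, the sketch below encodes the fidelity definition given above and derives relative fidelities from a few of the Table 4 values. The function and variable names are hypothetical. The correct-incorporation efficiencies (Table 1) are not reproduced in this section, so `fidelity` is shown only to make the definition explicit, and the relative values are computed directly from the tabulated fidelities.

```python
# Fidelity values from Table 4 (templating dG), ordered as (GT, GA, GG).
FIDELITY = {
    "WT":      (1.16e4, 1.16e4, 1.48e5),
    "p.E123K": (1.37e4, 1.26e4, 1.95e5),
    "p.E216K": (2.32e4, 2.37e4, 3.23e5),
    "p.M236L": (8.91e3, 8.2e3, 1.12e5),
}

def fidelity(eff_correct, eff_incorrect):
    """Fidelity as defined in the text: (correct + incorrect) / incorrect,
    where both arguments are catalytic efficiencies (kcat/Km)."""
    return (eff_correct + eff_incorrect) / eff_incorrect

def relative_fidelity(variant, reference="WT"):
    """Variant fidelity divided by reference fidelity, one value per mispair."""
    return tuple(v / w for v, w in zip(FIDELITY[variant], FIDELITY[reference]))

for name in ("p.E123K", "p.E216K", "p.M236L"):
    changes = [f"{100 * (r - 1):+.0f}%" for r in relative_fidelity(name)]
    print(name, dict(zip(("GT", "GA", "GG"), changes)))
```

A ratio above 1 corresponds to higher fidelity than WT and a ratio below 1 to lower fidelity; for p.M236L, all three mispairs come out below 1.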

Deoxyribose  Phosphate  Lyase  Activity 

Unlike the remaining polymerase variants analyzed here,
p.K27N is part of the deoxyribose phosphate (dRP) lyase domain
of pol β [Dalal et al., 2008]. Thus, we compared the dRP lyase


Table 3. Percentage Changes of kcat, Km,dNTP, and Catalytic
Efficiency (kcat/Km,dNTP) Compared to Wild Type

Enzyme     kcat        Km,dNTP     kcat/Km,dNTP

dG-dTTP
p.K27N     -           60% (↑)     40% (↓)
p.E123K    -           -           11% (↑)
p.E232K    4% (↓)      14% (↑)     16% (↓)
p.P242R    4% (↓)      17% (↑)     18% (↓)
p.E216K    9% (↓)      33% (↑)     44% (↓)
p.M236L    -           -           10% (↑)

dG-dATP
p.K27N     -           59% (↑)     39% (↓)
p.E123K    -           17% (↓)     21% (↑)
p.E232K    -           26% (↑)     24% (↓)
p.P242R    7% (↓)      22% (↑)     24% (↓)
p.E216K    16% (↓)     55% (↑)     46% (↓)
p.M236L    -           15% (↓)     19% (↑)

dG-dGTP
p.K27N     -           72% (↑)     47% (↓)
p.E123K    -           -           -
p.E232K    25% (↓)     28% (↓)     -
p.P242R    18% (↓)     -           16% (↓)
p.E216K    15% (↓)     68% (↑)     49% (↓)
p.M236L    19% (↑)     -           -

-, not significantly different; ↑, significantly increased;
↓, significantly decreased.

Table 4. Incorrect Incorporation Fidelity of a Single dNTP
Opposite G in a Gapped Oligo Substrate by Wild-Type and
Variants

           Fidelity^a, by mispair
Enzyme     GT                     GA                     GG

WT         1.16 (±0.03) × 10^4    1.16 (±0.07) × 10^4    1.48 (±0.05) × 10^5
p.K27N     1.12 (±0.03) × 10^4    1.09 (±0.05) × 10^4    1.63 (±0.19) × 10^5
p.E123K    1.37 (±0.05) × 10^4    1.26 (±0.15) × 10^4    1.95 (±0.10) × 10^5
p.E232K    2.23 (±0.04) × 10^4    2.44 (±0.16) × 10^4    2.29 (±0.43) × 10^5
p.P242R    1.67 (±0.03) × 10^4    1.78 (±0.12) × 10^4    2.06 (±0.10) × 10^5
p.E216K    2.32 (±0.33) × 10^4    2.37 (±0.28) × 10^4    3.23 (±0.19) × 10^5
p.M236L    8.91 (±0.06) × 10^3    8.2 (±0.03) × 10^3     1.12 (±0.10) × 10^5

aFidelity was calculated as described under "Experimental Procedures" and from
Table 1 and Table 2.

WT, wild-type; -, not significantly different.


Figure 3. Relative fidelity of pol β enzyme variants. WT and mutant variants were assayed on a single-nucleotide gapped DNA substrate with
a templating dG. Panels show the dG:dTTP, dG:dATP, and dG:dGTP mispairs. Fidelity was calculated as described under "Experimental Procedures" and is listed in Table 4. These data represent the mean of
at least three independent determinations.


Table 5. Steady-State Kinetic Parameters for pol β
Deoxyribose Lyase Activity

Enzyme     kcat (min^-1)     Km (nM)     kcat/Km (nM^-1 min^-1)

WT         4.1 × 10^-3       262         1.6 × 10^-5
p.K27N     1.1 × 10^-3       77          1.4 × 10^-5

Note: the kinetic values presented are calculated in the absence of APE I, as detailed in
Materials and Methods.


activity of p.K27N to that of wild type, with or without prior
addition of apurinic endonuclease I (APE I). The results showed
that the p.K27N mutant significantly decreased both the Km and
kcat, resulting in a small decrease in catalytic efficiency, kcat/Km
(Table 5). A similar decrease was observed following APE I
treatment (data not shown).
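
Because both kcat and Km fall by a similar proportion, their ratio changes little. The sketch below recomputes kcat/Km from the Table 5 values to make this explicit; the dictionary and function names are hypothetical, and the inputs are the rounded tabulated values, so the ratio is approximate.

```python
# Table 5 values: kcat in min^-1, Km in nM (measured without APE I).
LYASE = {
    "WT":     {"kcat": 4.1e-3, "km": 262.0},
    "p.K27N": {"kcat": 1.1e-3, "km": 77.0},
}

def lyase_efficiency(enzyme):
    """dRP lyase catalytic efficiency kcat/Km in nM^-1 min^-1."""
    params = LYASE[enzyme]
    return params["kcat"] / params["km"]

wt_eff = lyase_efficiency("WT")
k27n_eff = lyase_efficiency("p.K27N")
ratio = k27n_eff / wt_eff
print(f"WT: {wt_eff:.1e}, p.K27N: {k27n_eff:.1e}, p.K27N/WT: {ratio:.2f}")
```

With these inputs the p.K27N efficiency is roughly nine-tenths of the WT value, consistent with the small decrease described above.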

Discussion 

We have uncovered a significant prevalence of missense
nucleotide substitutions of pol β in prostate cancer tissue
[Makridakis et al., 2009]. In order to assess the potential effect
of these seven nucleotide substitutions (Table 1) on pol β
function, we utilized expression analysis followed by kinetic assays
on a DNA substrate containing a single nucleotide gap. Our
results demonstrate a significant change in the catalytic activity
(measured by the kcat) for both the triple mutant (p.P261L/
p.T292A/p.I298T) and p.E123K variants (Table 1): the triple
mutant has dramatically decreased activity, while the p.E123K
shows significantly increased activity. The Km for the correct
dNTP substrate (dCTP) was significantly increased (i.e., the
affinity was lower) for the triple mutant, p.K27N and p.M236L
variants (Table 1). The changes in pol β catalytic properties caused
by these mutations resulted in significant reductions in catalytic
efficiency (kcat/Km,dCTP) for both the triple mutant and the
p.K27N variants (Table 1 and Fig. 2). Other somatic variants (such
as the p.E123K and p.E232K) showed increased catalytic efficiency
compared to WT (Fig. 2), but this trend is within the experimental
error of the kinetic assays.

With the exception of the triple mutant (whose expression was
about half the WT level), all somatic variants resulted in essentially
normal protein steady-state levels (data not shown). Thus, any
change in the observed kinetic values reflects an alteration in the
catalytic properties of pol β, and not merely a change in enzyme
levels. This is also true for the triple mutant, whose catalytic
efficiency is reduced orders of magnitude more than the twofold
decrease in enzyme levels (Fig. 2). Catalytic efficiencies at
steady-state levels provide a better measure of the potential effect
of the somatic mutations on prostate tumor progression than
initial velocities or rates (kpol), because they reflect the effect of
these mutations over time (i.e., while the mutation is present in
the tumor). As previously mentioned, both higher and lower pol β
activity or levels result in increased mutagenesis in vivo. Even a
50% reduction in pol β activity or levels results in single-stranded
DNA breaks, chromosomal aberrations and mutagenicity [Cabelof
et al., 2003]. Based on their significant reduction in pol β catalytic
efficiency (Fig. 2), both the triple mutant and the p.K27N variants
are expected to result in mutagenicity, and to contribute to
prostate cancer progression in vivo. The effect of the somatic
mutations that marginally increase pol β activity (such as p.E123K
and p.E232K; Fig. 2) is less clear. An intriguing possibility is that
some of these mutations affect pol β binding to other components
of the BER machinery, such as XRCC1 [Sweasy et al., 2006].

Interestingly, the p.E123K, p.E232K, and triple mutant variants
are 100% prevalent in their respective prostate tumors (i.e., there
is no WT allele in each patient's tumor) [Makridakis et al., 2009].
The dramatic reduction in pol β activity conferred by the triple
mutant (Fig. 2) means that the patient carrying this mutation has
severely defective short-patch base excision repair in his tumor.
This patient is 51 years old and has a pT3a stage tumor [Makridakis
et al., 2009]. Prostate cancer usually occurs in men older than 60
years [Bruner et al., 1999], suggesting that this pol β variant may
contribute to early-onset prostate cancer.

The pol β knockout mouse is neonatal lethal (due to respiratory
failure), shows growth retardation, and displays apoptotic cell
death in the developing nervous system [Sugo et al., 2000]. The
finding of a severely defective pol β variant (the triple mutant) in a
prostate cancer patient's tumor suggests that pol β activity is not
essential in the adult prostate. Alternatively, it may be that this
patient's tumor has found other ways to compensate for the
loss of pol β activity (such as activation of long-patch base
excision repair, which can also repair oxidation damage).

The triple pol β mutant (p.P261L/p.T292A/p.I298T) changes
residues that are presumed to be important for both function
(T292; see Introduction) and structure: P261 forms a structurally
important hydrogen bond with glutamine-264 [Pelletier et al.,
1996]. P261 lies one residue away from the prostate cancer
associated pol β variant p.I260M [Dalal et al., 2005]. Interestingly,
the p.I260M mutant is a sequence-specific mutator that is thought
to exert its effect through the disruption of the hydrogen bond
between proline-261 and glutamine-264 [Dalal et al., 2005]. Thus,
the p.P261L mutation of the triple mutant may also confer a
mutator phenotype. Future biochemical analyses of the three
individual mutations that compose the triple mutant may shed
some light on the potential functional effects of each of these
variants on base excision repair.

Furthermore, the triple pol β mutant affects an enzyme domain
that has been previously shown to contain functionally important
mutations: the p.E295K pol β mutation, found in gastric cancer,
abolishes enzyme activity and induces cellular transformation
[Lang et al., 2007]. The p.M282L pol β mutation increased
mutagenesis and protein stability [Shah et al., 2001]. The
p.K289M mutation, found in colon cancer, also induced
mutagenesis [Lang et al., 2004]. These data suggest that the triple
pol β mutant may have severe consequences in vivo.

Table 3 demonstrates that most of the somatic pol β mutations
assayed here in misincorporation assays significantly change the
Km for dNTP compared to WT, but not the kcat. The biggest
changes in the kcat for dG:dGTP are afforded by the p.E232K
mutant (25% decrease) and the p.M236L mutant (19% increase)
(Table 3). kcat for the rest of the dNTPs is affected very modestly
for all mutants (no more than 16%; Table 3). However, the Km for
dNTP misincorporation is increased up to 60% for dTTP, 59% for
dATP and 72% for dGTP (all for the p.K27N mutant). Interestingly,
most somatic mutations increase the Km for dNTP
misincorporation (Table 3). The biggest decrease in the Km for
dNTP misincorporation is attained by the p.E232K mutant: a 28%
decrease in the Km for dGTP misincorporation.

The catalytic efficiency (kcat/Km,dNTP) for dNTP misincorporation
is significantly lower than WT for most of the somatic
mutations (Table 3). The biggest change in the kcat/Km,dNTP for
dTTP misincorporation is achieved by the p.E216K mutant (44%
decrease); the same mutant also shows the biggest change for
dATP (46% decrease) and for dGTP (49% decrease) (Table 3).

Consistent with the change in catalytic efficiency for
misincorporation, most somatic pol β mutations show significantly
increased fidelity of DNA synthesis (Fig. 3), suggesting that they
function as antimutators. The p.K27N mutation that demonstrated
reduced activity for correct incorporation (Fig. 2) displays no
change in fidelity compared to WT (Fig. 3). We were unable to
measure fidelity for the triple mutant, due to its very low activity
for misincorporation (data not shown). The p.E123K mutation
displays significantly increased fidelity for dTTP and dGTP (up to
32%), but not dATP (Fig. 3). The p.E232K, p.P242R, and p.E216K
mutations display significantly increased fidelity (up to 118%) for
all dNTPs (Fig. 3). In contrast, the p.M236L variant showed
decreased fidelity compared to WT: 23%, 29%, and 24% for dTTP,
dATP, and dGTP, respectively (Table 4 and Fig. 3). Thus, the
p.M236L mutation may confer a mutator phenotype. Interestingly,
the prostate tumor that bears the p.M236L mutation has
microsatellite instability [Makridakis et al., 2009], supporting this
hypothesis. Furthermore, the effect of the p.M236L mutation on
synthesis fidelity may be due to the destruction (in the mutant) of a
hydrogen bond important for template binding (see Introduction).

As mentioned above, the p.E123K and p.E232K variants are
100% prevalent in their respective prostate tumors [Makridakis
et al., 2009]. These somatic variants, together with the p.E216K
(present in 52% of its respective tumor) [Makridakis et al., 2009],
show significantly increased fidelity compared to WT, and thus
may function as antimutators. Interestingly, all three of these
variants show a trend towards higher pol β activity (although it
does not reach statistical significance in our assay; Fig. 2). It is
tempting to hypothesize that these three variants may actually
reflect the response of prostate tumors to chemotherapy: DNA
damage (e.g., alkylation) caused by chemotherapeutic drugs may
actually select for tumor mutations that have both increased pol β
activity and fidelity, in order to repair the damage. We do not have
data on the chemotherapeutic regimen given to these patients, so
we cannot directly probe this scenario at this time. However,
increased pol β expression has been significantly associated with
poorer chemotherapeutic response and prognosis in colorectal
cancer [Iwatsuki et al., 2009]. An alternative model is that the
p.E123K, p.E232K, and p.E216K pol β mutations are actually
"passengers," that is, not "drivers" of tumor progression.

Several previously characterized pol β mutants exhibit
misincorporation bias. For example, the p.D246V pol β mutant,
present in the "flexible loop" (where the p.P242R mutant also lies;
see Introduction), preferentially misincorporates dTTP opposite
a templating dG [Dalal et al., 2004]. This misincorporation bias
makes the p.D246V a mutator mutant mainly for C>T
transitions. We examined our somatic pol β mutants for
misincorporation bias. Table 4 indicates a similar trend for
mutator/antimutator status for all template:dNTP
misincorporations and all somatic mutations that affect fidelity of
DNA replication (including the p.P242R). Thus, we conclude that
these somatic mutations do not result in significant
misincorporation bias. However, pol β misincorporation bias is
also known to depend on sequence context. Thus, it is possible
that varying the template sequence may result in distinct
misincorporation bias for specific mutants. Future experiments
will test this hypothesis following expression analysis of the
somatic pol β mutants in vivo.

The triple pol β mutant dramatically affects the Km for dCTP
(15-fold increase). None of the residues that bind dCTP (based on
the crystal structure) [Sawaya et al., 1997] is directly mutated in
the triple mutant. However, D276 of pol β binds dCTP [Sawaya
et al., 1997], and is also part of an α-helix that is in close
proximity to two stacked β-sheets that include two of the
residues mutated in the triple mutant, p.T292 and p.I298 (Jmol;
http://molvis.sdsc.edu/fgij/fg.htm?mol=2FMS). Mutations of
these residues from threonine to alanine (at position 292) and
from isoleucine to threonine (at position 298) may destabilize the
local structure, perhaps reducing the stacking effect (especially the
I298T mutation) and thus the interaction between the β-sheets
and the α-helix containing p.D276. This, in turn, could affect the
triple mutant enzyme's affinity for dCTP.

The p.K27N pol β mutant significantly decreases catalytic
efficiency (kcat/Km,dCTP) without changing kcat (Table 1). The
p.K27N effect on the pol β catalytic efficiency can be explained by
a 72% increase in the Km for dCTP (Table 1). The p.K27N
mutation is not physically close to any of the residues that bind
dCTP [Sawaya et al., 1997] (Jmol;
http://molvis.sdsc.edu/fgij/fg.htm?mol=2FMS). However, the Km
for dCTP can increase due to: (1) an increase in the dissociation
constant Kd (for dCTP), (2) a slower rate of dCTP insertion, or
(3) decreased binding affinity for the DNA template. K27 of pol β
is only 6 Angstroms away from the DNA template (Jmol;
http://molvis.sdsc.edu/fgij/fg.htm?mol=2FMS). The p.K27N
mutation abolishes the positive charge of lysine 27, which may
destabilize the interaction between this residue and the negatively
charged DNA template backbone, resulting in lower affinity.

Figure 4. Location of the somatic pol β mutations. The structure of
human DNA polymerase beta with dUTP (orange crosses) opposite dA
and a gapped DNA substrate (2FMS) is shown using Jmol
(http://molvis.sdsc.edu/fgij/fg.htm?mol=2FMS). Yellow "halos" indicate
the side groups of the mutated residues. The gapped DNA substrate is
shown in green and yellow, the polymerase β in gray. Small green
spheres indicate Mg2+ ions, large green ones Cl- ions, and purple
ones Na+ ions.


A similar scenario may explain the effect of the triple mutant on
the Km for dCTP (alternatively explained above). The p.T292A
mutation present in the triple pol β mutant is expected to abolish
a hydrogen bond between this pol β residue and the DNA
template (see Introduction), which may result in decreased affinity
for the DNA template. This, in turn, could result in increased Km
for dCTP (as mentioned above for the K27N mutant).

Unlike the dramatic decrease in dRP lyase activity caused by
the previously characterized p.L22P pol β mutant [Dalal et al.,
2008], the p.K27N variant characterized here shows only a small
decrease in catalytic efficiency (Table 5). This finding may be due to
the positioning of the side chain of K27, which points away from
the lyase active site [Prasad et al., 2005].

The crystal structure of human pol β is available [Pelletier et al.,
1996]. Visualization of the somatic mutations that we characterized
here in the pol β structure (Fig. 4) indicates that these variants
are not in a specific part of the protein, but are distributed
throughout the structure. Some of the mutant residues are in areas
of protein-DNA interaction, others in areas of interaction
between protein domains (e.g., between two β-sheets), whereas
others are in areas critical for structural maintenance [Pelletier
et al., 1996]. These observations suggest that multiple structural
parts of pol β are critical for BER function. The observation of
several pol β mutants that affect DNA synthesis fidelity without
being part of the active site is of particular interest.

In summary, we biochemically analyzed all missense somatic
mutations of pol β that we previously identified in prostate cancer
[Makridakis et al., 2009]. We report that all missense somatic
pol β mutations have functionally significant effects: the triple
mutant and the p.K27N variants affect catalytic efficiency, while
the p.M236L, p.E123K, p.E232K, p.P242R, and p.E216K mutations
alter the fidelity of DNA synthesis. These somatic mutations are
present in a total of 7 out of 26 (27%) prostate cancer patients
[Makridakis et al., 2009]. If one adds to this total the two patients
with splice junction mutations (that are predicted to result in
amino acid deletions) [Makridakis et al., 2009], then we conclude
that 9 of 26 (35%) prostate cancer patients have functional
somatic mutations of pol β. Functional pol β mutations have been
identified at a high frequency in other types of cancer [Starcevic
et al., 2004]. Thus, interfering with pol β activity may be a
common mechanism of carcinogenesis. Moreover, our data
significantly expand the current knowledge of the molecular
determinants of both activity and fidelity for a model monomeric
eukaryotic polymerase, pol β.